DOCUMENT RESUME

ED 334 ?05    TM 016 602

AUTHOR: Linn, Robert L.; And Others
TITLE: Quality of Standard Tests. Final Report, School
Reform Assessment Project. Final Deliverable - October
1989.
INSTITUTION: Center for Research on Evaluation, Standards, and
Student Testing, Los Angeles, CA.
SPONS AGENCY: Office of Educational Research and Improvement (ED),
Washington, DC.
PUB DATE: Oct 89
CONTRACT: OERI-G-86-0003
NOTE: 155p.; Paper prepared in collaboration with the
University of Colorado; NORC, University of Chicago;
and Arizona State University.
PUB TYPE: Reports - Research/Technical (143) --
Tests/Evaluation Instruments (160) -- Collected Works
- General (020)

EDRS PRICE: MF01/PC07 Plus Postage.
DESCRIPTORS: Achievement Gains; Achievement Tests; Educational
Assessment; *Educational Change; Elementary Secondary
Education; *Grade Inflation; Mathematics Tests; Norm
Referenced Tests; Quality Control; Reading Tests;
School Districts; *Scoring; *Standardized Tests;
State Norms; *State Officials; State Programs; State
Surveys; Testing Problems; Testing Programs; *Test
Use
IDENTIFIERS: Lake Wobegon Phenomenon; Teaching to the Test

ABSTRACT



Norm-referenced test results reported by states and
districts were studied along with factors related to those scores to
document the degree to which "above average" achievement test results
are being presented. Part of the stimulus for the present report came
from the study by J. J. Cannell and the community group Friends of
Education. A letter and data collection form were mailed to directors
of testing in all states to obtain test score information. These
directors were then interviewed by telephone for further information
about testing. A stratified random sample of 175 school districts was
surveyed by mail, with telephone interviews conducted with a
subsample. Weighted estimates from the district sample suggested that
57% of students in grades 1 through 6 obtained scores above the
national median on norm-referenced reading tests, and 62% was the
corresponding figure for mathematics tests. Corresponding figures for
grades 7 through 12 were lower, but still over 50%. State results
were compatible with district estimates, providing some support for
the general finding of Cannell's group that for elementary school
grades almost all states and most of the districts are reporting
norm-referenced achievement test results that are above the national
median. Possible explanations and suggestions for test use are
reviewed. Seven tables, 33 figures, and a 39-item list of references
are included. Seven appendices present forms used in the study and
descriptive data. (SLD)










Center for Research on Evaluation, 
Standards, and Student Testing 

Final Deliverable - October 1989 

School Reform Assessment Project 

Quality of Standard Tests: 
Final Report 

Project Director: Robert Linn 
Grant Number OERI-G-86-0003 



Center for the Study of Evaluation 
Graduate School of Education 
University of California, Los Angeles 



Comparing State and District Test Results to National Norms:
Interpretations of Scoring "Above the National Average"



Robert L. Linn, M. Elizabeth Graue, and Nancy M. Sanders 

Center for Research on Evaluation, Standards, 
and Student Testing

University of Colorado at Boulder 



The project presented, or reported herein, was performed pursuant
to a grant from the Office of Educational Research and
Improvement, Department of Education (OERI/ED). However, the
opinions expressed herein do not necessarily reflect the position
or policy of the OERI/ED and no official endorsement by the
OERI/ED should be inferred.






Acknowledgement



We thank the many state and district directors of testing
and other staff who responded to our requests for data and took
the time to complete the questionnaires and participate in the
telephone interviews. Without their cooperation this study would
not have been possible. A number of students worked part time on
various aspects of collection, organization, computer entry, and
analysis of the data and we are grateful for that assistance. We
especially thank Frank Beaty, William Carlin, Ann Carrington,
Sharon Catto, Jolene Dehning, Roy Grimm, and Kelly Strong in this
regard. We also thank the members of the project advisory
committee (David Bayless, Stan Bernknopf, Gordon Ensign, H. D.
Hoover, Richard Jaeger, Daniel Koretz, and Lynn Winters) for
their many helpful suggestions. Finally, we thank Eva Baker,
Leigh Burstein, Stephen Dunbar, and Lorrie Shepard for their 
helpful comments on earlier drafts of this report. 



Comparing State and District Test Results to National Norms:
Interpretations of Scoring "Above the National Average"

It has become commonplace for a state or district to 
report that its students are "scoring above the national 
average". Indeed, it has been suggested that all 50 states and 
most districts are reporting above average achievement test 
scores (Cannell, 1987). Is it really the case that all states 
claim that their students are performing above average on 
achievement tests? If so, how should such results be 
interpreted? 

These are two of several questions that motivated a study 
of norm-referenced test results that are being reported by states 
and school districts and factors related to those scores. This 
report presents part of the findings of that study. Published 
reports and results of mail and telephone surveys of states and a 
nationally representative sample of school districts were used to 
document the degree to which "above average" achievement test 
results are being presented. Analyses of the possible influence 
of the changing meaning of norms are also presented. Subsequent 
reports will address a number of other factors that may have an 
impact on the achievement test scores of states and districts and 
on the proper interpretation of those results. 

BACKGROUND 

Standardized achievement tests have long been used by 
schools to report student achievement to parents, policy makers, 
and the general public. In recent years, however, the attention
given to test scores has increased dramatically. Low-stakes
testing programs with results returned to teachers and reported 
in a low-key fashion to school boards and interested parents have 
given way to high-stakes testing programs that have direct and 
important effects on students, teachers, and school 
administrators. The increased emphasis on the use of test 
results for purposes of accountability has made questions of test 
quality and the trustworthiness of interpretations of major
concern to educators and policy makers. 

A major, albeit not the only or necessarily the best, way 
of providing the various audiences a means of interpreting test 
scores is to compare achievement test scores for a school 
building, a district, or a state to national norms. Slightly 
over half of the states and a substantial majority of the school 
districts rely on off-the-shelf, standardized achievement tests, 
for which normative comparisons provide a primary basis of 
interpretation. These comparisons take on a wide variety of 
forms, including the average grade equivalent score, the average 
normal curve equivalent score, the median percentile rank or 
percentile rank of the mean, the proportion of students scoring 
above the "national average", or more precisely, the national 
median, and the proportions of students with "below average,
average, or above average" scores where the three categories
correspond to stanines 1 thru 3, 4 thru 6, and 7 thru 9, 
respectively. In each of these examples, national norms provide 
the primary basis of comparison. 






Norms, of course, are not the only basis of interpreting
test scores. Some states and districts rely on criterion- 
referenced interpretations of either publisher- or locally- 
developed tests. In such cases, comparisons to past performance 
provide a key means of interpretation. For example, trends in 
the proportion of students passing a minimum-competency test, the 
proportion of students mastering specific objectives, or the 
average number of objectives mastered provide a means of 
comparing the current year's achievement with a benchmark. 
Trends may also be important in the interpretation of norm-
referenced results, but the national norm still provides the
major frame of reference for expressing the scores. Even states 
with locally-developed or customized assessment programs 
sometimes also use comparisons to national norms that are 
obtained through special equating studies or item response theory 
links to aid in the interpretation of their achievement test 
results. 

The pros and cons of normative comparisons have been
discussed on many occasions. Discussions of appropriate and
inappropriate normative interpretations are provided, for 
example, by Angoff (1971), Petersen, Kolen, and Hoover (1989), 
and in several introductory texts on educational and 
psychological measurement. Good discussions of appropriate and 
inappropriate uses and interpretations of norms may also be found
in the technical manuals and interpretive guides provided by the 
publishers of the major standardized achievement tests. 






Despite these discussions, normative interpretations 
continue to be misused and misinterpreted. The distinction that 
Angoff (1971) and others have made between the statistical 
meaning of "normative" which refers to "performance as it exists" 
and the use of the term to refer to "standards or goals 
performance" (p. 533), is too often overlooked. The fact that
norms for school averages or district averages differ markedly
from norms for individual students is too often ignored or given
insufficient emphasis in interpretation. Because a school 
average is based on a range of student scores it necessarily 
falls somewhere in between the score of the highest scoring 
individual student and that of the lowest scoring student. 
Consequently, the distribution of school average scores is less 
variable than the distribution of individual student scores. The 
average achievement score that corresponds to the 70th percentile 
using school building norms, for example, may correspond to only 
the 60th percentile using norms for individual students. 
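
The gap between school-level and student-level norms can be illustrated with a minimal sketch. It assumes, purely for illustration, that student scores and school averages are both normally distributed around the same mean and that the standard deviation of school averages is half that of individual scores. The ratio of 0.5 and the function name are hypothetical and are not taken from any publisher's norms tables.

```python
from statistics import NormalDist


def student_percentile_of_school_percentile(school_pct, sd_ratio=0.5):
    """Convert a percentile rank on school-average norms to the
    percentile rank of the same score on individual-student norms.

    sd_ratio is the ASSUMED ratio of the standard deviation of school
    averages to the standard deviation of individual student scores.
    """
    std_normal = NormalDist()  # scores expressed in standard deviation units
    # z-score of the school average within the distribution of school averages
    z_school = std_normal.inv_cdf(school_pct / 100.0)
    # the same score expressed in student-level standard deviation units
    z_student = z_school * sd_ratio
    return 100.0 * std_normal.cdf(z_student)


# Under the assumed ratio of 0.5, the 70th percentile on school norms
# falls at roughly the 60th percentile on student norms.
print(round(student_percentile_of_school_percentile(70)))
```

With a smaller assumed ratio the two percentile ranks diverge further; with a ratio of 1.0 they coincide.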

It is widely believed that some tests have "easier" norms 
than others. If the norms of test A are easier or less stringent
than those of test B, then a given level of achievement would be 
expected to appear better (e.g., result in a higher percentile 
rank or a larger proportion of students scoring above the 
national average) with test A than with test B. Note that the 
difficulty of norms is different than the intrinsic difficulty of 
test items. A test that asked easy questions could have hard 
norms because the norming sample was unusually able in the 
content area of the test. Conversely, a second test that asked
relatively more difficult questions could have easier norms
because the norming sample for the second test included a 
disproportionate number of low achieving students. The relative 
difficulty of norms for a particular school, school district, or 
state may also depend on the degree to which the test content 
matches the curriculum at the building or classroom levels. 

The meaning of norms depends fundamentally on the 
definition of the reference population, and secondarily on the 
adequacy of sampling, the level of participation, and the 
motivation of the students in the norming sample, among other 
considerations. The year in which the norms were obtained is one 
of the important properties that define the reference population 
and it is clearly the case that norms become dated. If 
achievement is improving nationally, then the use of old norms 
will make a district or state appear to be doing better relative 
to the nation than would the use of current norms which provide a 
higher standard of comparison. 
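
The effect of dated norms can be sketched numerically. The example below assumes normally distributed scores with equal spread in the norming year and the current year, and a hypothetical national gain of 0.2 standard deviations since the norms were set. Both the distributional assumption and the size of the gain are illustrative only and are not estimates from this study.

```python
from statistics import NormalDist


def percent_above_old_median(gain_in_sd):
    """Percent of the current national population scoring above the
    median of an older norming sample, ASSUMING normal distributions
    with equal spread and a mean gain of gain_in_sd standard
    deviations since the norms were obtained."""
    return 100.0 * NormalDist().cdf(gain_in_sd)


# With no national gain, exactly half of students exceed the old median.
print(percent_above_old_median(0.0))
# With an assumed gain of 0.2 SD, about 58% exceed the old median.
print(round(percent_above_old_median(0.2), 1))
```

Under these assumptions even a modest national gain is enough for the nation as a whole to appear "above average" relative to the older norms.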

Although the above concerns about the use of norms are 
hardly new, questions about the meaning and trustworthiness of 
normative comparisons that states and districts are using to 
communicate test results to policy makers and the public have 
recently taken on increased importance. The increased importance 
is due, in part, to escalation in the stakes involved in testing.
Concerns about normative comparisons were also exacerbated by the 
publication of a report by Dr. John J. Cannell (1987) entitled 
"Nationally Normed Elementary Achievement Testing in America's 
Public Schools: How All Fifty States Are Above Average".






The Cannell report is based on a survey conducted by a 
community group, the Friends of Education, which found that "no 
state scores below the publisher's 'national norm' at the 
elementary level on any of the six major nationally normed, 
commercially available tests" (Cannell, 1987, p. 2, emphasis in
original). Based on this finding, Cannell concluded that
"standardized, nationally normed achievement tests give children,
parents, school systems, legislatures, and the press inflated and
misleading reports on achievement levels" (p. 2).

Cannell was not the first to notice that states were 
reporting results that were above the national norm in greater 
numbers than would be expected based on past experience or 
common-sense notions of the likely relative standing of
particular states. In 1984, the Southern Regional Education 
Board (SREB) reported that 9 of 11 SREB states with norm- 
referenced test results for elementary grades were at or above 
the national average (SREB, 1984). Two years later, "[i]n June,
1986, SREB first described this situation in which student 
achievement in nearly all states was reported to be at or above 
the national averages as the 'Lake Wobegon effect' — descriptive 
of Garrison Keillor's mythical town where all children are above 
average" (Korcheck, 1988, p. 3). However, it was the Cannell
report that placed the issue in the national limelight. 

The Cannell report attracted a good deal of attention in 
the press when it was released in the fall of 1987 and has been 
the focus of considerable debate and controversy among 
professional educators and measurement specialists ever since.
There are undoubtedly a number of factors that helped focus
attention on the findings. Dramatic statements regarding the 
findings such as those illustrated in the above quotes may be 
part of the reason. Interest in the report was probably enhanced 
also by the sharp criticisms of test publishers ("we believe
inaccurate initial norms are the reason for high scores", p. 5,
emphasis in original), of educators for the "integration of
unchanging test questions into the curriculum" (p. 5, emphasis in
the original), of those responsible for reporting student
achievement ("no state publication honestly described norm-
referenced testing", p. 6), of university and public educators
serving as consultants to test publishers "who too often are mere
sycophants, giving the commercial interests what they want" (p.
9), and of the U. S. Department of Education, "whose lack of
knowledge of these tests constitutes nonfeasance" (p. 9, emphasis
in original).

Even without the dramatic language and sharp criticism, 
however, the Cannell report raises serious questions and issues. 
The percentage of students reported to be scoring above the 
national 50th percentile in a number of states seems to defy 
common sense. 

The Cannell report has been the focus of considerable
discussion at national meetings and in professional journals
concerned with issues of educational achievement and measurement. 
It was a major topic, for example, at the 1988 and 1989 Annual 
Assessment Conferences sponsored by the Education Commission of
the States. The report was featured along with six commentaries
from test publishers and representatives of the U. S. Department
of Education in the Summer 1988 issue of Educational Measurement:
Issues and Practice. The report also led the U. S. Department of
Education to arrange a meeting involving Dr. Cannell,
representatives of major test publishers, and selected academics
to discuss the findings and their implications in February, 1988.

Reviewers of the Cannell report (e.g., Drahozal & Frisbie,
1988; Koretz, 1988; Lenke & Keene, 1988; Phillips & Finn, 1988;
Qualls-Payne, 1988; Stonehill, 1988; Williams, 1988) identified a
number of factors, some of which were also suggested by Cannell,
that might contribute to the seemingly anomalous finding that all
states are above the national average. The fact that norms
become dated was probably the most frequently mentioned potential
explanation. Differences in the rules for exclusion of students
from testing in norming and in operational testing programs were
also proposed as a possible explanation by several reviewers
(e.g., Drahozal & Frisbie; Koretz; Lenke & Keene; Phillips &
Finn). Other suggested partial explanations included the
possible effect of a closer match between the test and the local
curriculum in operational testing programs than in norming
samples (e.g., Koretz; Lenke & Keene; Phillips & Finn), and the
possibilities that poor security, familiarity with the specific
content of tests that are reused year after year, or teaching the
test may inflate scores (e.g., Drahozal & Frisbie; Koretz;
Phillips & Finn).

Reviewers (e.g., Drahozal & Frisbie, 1988; Koretz, 1988;
Lenke & Keene, 1988; Phillips & Finn, 1988; Williams, 1988) also
identified several shortcomings of the Cannell study and
interpretations. The failure to distinguish between group and
individual student norms in interpretations, aggregation bias
that results when the percent of districts with average scores
above the national median is used to make inferences about the
percent of students with scores above the national median, and 
the treatment of the percent of students at the 4th stanine or 
above as if it were an indicator of the percent of students above 
the national average are among the misleading analyses and 
interpretations that were identified. 
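
Two of these problems can be made concrete with a short sketch. The stanine percentages below are the standard ones for a normal distribution. The district figures are invented for illustration, and a district is counted as "above" when more than half of its students exceed the national median, which is a simplifying assumption rather than the definition used in any of the studies cited.

```python
# Standard percent of a norming population falling in stanines 1 through 9.
STANINE_PERCENTS = [4, 7, 12, 17, 20, 17, 12, 7, 4]


def percent_at_or_above_stanine(stanine):
    """Percent of the norming population at or above a given stanine."""
    return sum(STANINE_PERCENTS[stanine - 1:])


# HYPOTHETICAL districts: (students tested, percent above national median).
districts = [(1000, 55.0), (1500, 52.0), (2000, 51.0), (40000, 40.0)]


def percent_of_districts_above(rows):
    """Percent of districts in which more than half the students
    scored above the national median."""
    above = sum(1 for _, pct in rows if pct > 50.0)
    return 100.0 * above / len(rows)


def percent_of_students_above(rows):
    """Percent of all students, pooled across districts, who scored
    above the national median."""
    total = sum(n for n, _ in rows)
    return sum(n * pct for n, pct in rows) / total


# In the norming population itself, 77% are at stanine 4 or above,
# so that figure cannot stand in for "percent above average".
print(percent_at_or_above_stanine(4))
# Three of the four invented districts are "above", yet well under half
# of the pooled students are, because the large district dominates.
print(percent_of_districts_above(districts))
print(round(percent_of_students_above(districts), 1))
```

The sketch shows only the direction of the distortion; the size of the bias in practice depends on how enrollment and achievement are distributed across districts.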

Despite these and other limitations, some reviewers 
concluded that Cannell's major findings are still probably
correct. Stonehill (1988), for example, stated simply that 
"Cannell's evidence is compelling" (p. 23). Others were more 
circumspect. Koretz (1988), for example, noted that "Dr. 
Cannell's errors are to some extent beside the point ... for they 
are not sufficient to call into question his basic conclusion" 
(p. 11) and Phillips and Finn (1988) stated that in the absence 
of "evidence to the contrary" they generally concurred with "the 
central finding of Dr. Cannell's report" (p. 10). 

PROCEDURE 

The Cannell study provided part of the stimulus for the 
present study. Certainly the issues raised in that study are 
important ones that deserve to be investigated in greater detail. 
Of particular concern were the issues of aggregation bias, the 
sampling of districts to obtain estimates for states without 
statewide testing programs that provide normative comparisons to
the nation, and the type of information obtained from districts.
The Cannell study only asked districts whether their students 
were above or below the national average. More detailed district 
results would be more informative. Since the Cannell study did 
not include results for secondary schools, it was also important
to expand the coverage to all elementary and secondary school
grades. 

Our interest, however, was in more than simply obtaining 
estimates of the number of states or the proportion of districts 
that report achievement test results that are above the national
median or that have average achievement above the national mean. 
Such statistics are of interest, but are apt to raise more 
questions than they answer. It is evident that we also need to 
better understand the ways in which states and districts are 
using normative comparisons, the validity of those comparisons, 
and the factors that influence both the results and the validity
of test scores and their interpretation. Therefore, the present
study was designed to collect data not only about the achievement
scores that are reported by states and districts, but on a
variety of related issues, including the way in which test
results are used (e.g., public reporting, grade retention, school
incentives), when and why the uses were initiated, how and when
the tests were adopted, and policies regarding test
administration, test security and the preparation of students for
taking tests. The present report, however, is focused on the 
test results and the possible influence of changes in the 
stringency of norms over time. Other aspects of the project data
are addressed elsewhere (e.g., Baker, 1989; Burstein, 1989;
Shepard, 1989).

State Survey

Two national mail and telephone surveys were conducted.
In the first survey, a letter and a data collection form (see
Appendix A) were mailed to the directors of testing in all 
states. As can be seen in the sample copy in Appendix A, the 
state testing directors were asked to provide test results in 
reading and mathematics for all grades (K through 12) for the 
three most recent academic years (1985-86, 1986-87, and 1987-88). 

If available, the states were asked to report the percent 
of students scoring above the national 50th percentile statewide. 
When this information was not available, the states were asked to 
report state means and standard deviations in reading and 
mathematics as well as the scores that correspond to the 25th, 
50th and 75th percentiles statewide. In addition to test score 
information, the states were asked to provide the name, edition, 
and form of the test used at each grade; the year the test was 
first used in the state; the year it was normed; the month of 
administration; and the way the scores are routinely reported, 
e.g., percent of students above the national median. The number 
of students enrolled, the number tested, and the number for whom 
scores were reported were also requested at each grade for each 
of the three years in question. 

Since much of the information we were seeking was already 
available in published reports, the State Directors of Testing 
were asked to send copies of reports containing the requested
information. The reports served in place of the completed data
collection forms if they contained the necessary information. 
Since information about how scores are communicated to the public 
and how they are interpreted by the press was relevant to our 
interests, copies of press releases and newspaper articles about 
test results were requested. 

Following the mailings, State Directors of Testing were 
contacted by phone to arrange telephone interviews. Detailed 
results of the telephone interviews are presented in other 
reports of study results (e.g., Shepard, 1989), hence only a 
brief description of the interview is presented here. 

A copy of the telephone interview guide is shown in
Appendix B. In addition to clarification questions about testing 
data requested on the data collection forms, Testing Directors 
were asked questions about test use, test selection, the 
alignment of curriculum with the test, about time spent on 
teaching tested objectives, about objectives given less time as 
the result of the test, about guidelines for test preparation, 
about typical and extreme practices in preparing students to take 
tests, and about test security practices and experience.

District Survey

A stratified random sample of districts designed to be 
representative of the fifty states was selected. The 1980 census 
data were used to stratify school districts by region, size, and 
socio-economic status (SES). The definitions of the levels of 
the three stratification variables are provided in Table 1. As can
be seen in Table 1, the three stratification variables: region,
size, and SES, had four, eight, and five levels, respectively.
Thus a total of 160 cells were defined. The SES index, which is 
defined in Table 1, was used to rank the school districts and 
then to define five strata such that approximately 15% of the
students were in each of the two extreme strata (low and high),
approximately 20% were in each of the adjacent strata (above and
below average), and approximately 30% were in the average
stratum. 
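
The cell structure and the SES strata can be sketched as follows. The counts of levels (four, eight, and five) and the target shares of students (15%, 20%, 30%, 20%, 15%) come from the text. The rule used here to place a district in a stratum (the cumulative share of students at the midpoint of the district, after ranking by the SES index) and the example data are assumptions made for illustration; the report does not describe the exact assignment rule.

```python
from collections import Counter
from itertools import product

REGION_LEVELS, SIZE_LEVELS, SES_LEVELS = 4, 8, 5

# Every combination of region, size, and SES level defines one cell.
cells = list(product(range(REGION_LEVELS), range(SIZE_LEVELS), range(SES_LEVELS)))

# Target share of students in each SES stratum, from lowest to highest.
SES_SHARES = [("low", 0.15), ("below average", 0.20), ("average", 0.30),
              ("above average", 0.20), ("high", 0.15)]


def assign_ses_strata(districts):
    """districts: list of (name, ses_index, enrollment) tuples.
    Returns {name: stratum label}. ASSUMED rule: rank by SES index and
    use the cumulative student share at each district's midpoint."""
    ranked = sorted(districts, key=lambda d: d[1])
    total = sum(d[2] for d in ranked)
    bounds, running = [], 0.0
    for label, share in SES_SHARES:
        running += share
        bounds.append((label, running))
    strata, cumulative = {}, 0
    for name, _, enrollment in ranked:
        midpoint = (cumulative + enrollment / 2) / total
        for label, upper in bounds:
            if midpoint <= upper + 1e-9:  # tolerance for float rounding
                strata[name] = label
                break
        else:
            strata[name] = bounds[-1][0]
        cumulative += enrollment
    return strata


# HYPOTHETICAL data: 20 equal-sized districts with SES index 0..19.
example = [("district_%02d" % i, i, 100) for i in range(20)]
print(len(cells))
print(Counter(assign_ses_strata(example).values()))
```

With equal-sized districts the strata contain 3, 4, 6, 4, and 3 districts, matching the 15/20/30/20/15 split of students; with unequal enrollments the district counts per stratum would differ while the student shares stay close to the targets.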

Five districts were randomly selected for each cell where 
a sufficient number of districts was available according to the 
1980 census. Five districts were available and selected for most 
cells; however, 15 of the cells were void and 39 of the cells had
fewer than five districts. For example, there are no high SES
districts with enrollments of 100,000 or more in the 
North/Central region and there is only one low SES district with 
an enrollment of 100,000 or more in the East region. 

The first of the randomly-ordered districts in each of the 
145 non-void cells was selected for inclusion in the survey. 
Because achievement test results of large school districts have 
been the focus of considerable attention in recent years, we were 
particularly interested in obtaining better information about the 
achievement test results being reported by larger districts. 
Therefore, districts with enrollments of 50,000 or more were over 
sampled. With the over sampling of large districts, a total of 
175 districts were selected for the sample. Appendix C lists the 
number of districts selected per cell. 






After districts were selected, telephone calls were made to
confirm that the district was still operating (had not, for
example, been consolidated with another district since the 1980
census), to identify appropriate respondents who were responsible
for the district testing program, and to obtain complete mailing 
addresses. Where a district no longer existed, the second listed 
district in the corresponding cell of the sampling design was 
selected as a replacement. Once addresses were obtained, letters 
(see Appendix D) and data collection forms were mailed. 

A subsample of the districts was identified for telephone
interviews to be conducted following the mail survey (see
Appendix E for a description of the procedures used to identify
the interview subsample). Because telephone interviews were to
be conducted with a subsample of the districts, two different
letters requesting participation and two different data
collection forms were sent to districts (see Appendix D). The
same basic test data that were requested from states were also
requested for all districts. Districts in the mail-survey-only
subsample were also sent a brief questionnaire covering some of
the interview questions about the use of test results and
perceived effects of testing in the district (see Appendix D).
Districts in the interview subsample did not receive a
questionnaire, but were asked questions shown in the interview
guide in the telephone survey (Appendix D).

Follow-up letters were sent to districts approximately
three weeks and again six weeks after the initial mailing. If no
response was received within 3 weeks after the second follow-up,
attempts were made to reach respondents by telephone and urge
them to respond to the survey. When district personnel declined
to participate in the survey or could not be reached after
repeated telephone attempts, the reason for the non-participation
was recorded, and a substitute district was selected from the
appropriate cell in the sampling design.

RESULTS 

States with Norm-Referenced Comparisons

A total of 35 states provided results that allowed norm-
referenced comparisons for one or more grades in at least one of
the three years for which data were collected (1985-86, 1986-87,
and 1987-88). The remaining 15 states do not use tests with
national norms. The 35 states for which norm-referenced
comparisons were obtained are listed in Table 2 along with an
indication of the basis for the comparison and the grades for which
test results are reported. The basis for comparisons to national
norms for states that administer an off-the-shelf, norm-
referenced test is obvious. However, in order to obtain
estimates of the percent of students scoring above the national
median or the percentile rank of the state mean or median test
score it was sometimes necessary to convert scores from the form
in which they were reported. For example, if the state reported
mean grade-equivalent scores, those scores were converted to
the corresponding percentile rank by reference to the test
publisher's norms tables for individual pupils.

Several of the states listed in Table 2 obtain normative 
comparisons indirectly by linking non-normed tests or state
assessment results to a norm-referenced test through the use of
special equating studies or the inclusion of norm-referenced test
items with known item parameters in a customized test (see, for
example, Yen, Green, & Burket, 1987, for a discussion of
customized tests). States for which norm-referenced comparisons
are obtained indirectly through such linkages are indicated in
Table 2 by the word "LINK" in the column showing the basis of
comparison. 

Although comparisons to national norms either directly or 
through an equating link can be obtained for a total of 35 states 
in all, the number of comparisons varies substantially by grade 
level. As can be seen in Table 2, the largest number of states 
with results for any single grade is 22 at grade 8. Grades 3, 
with 20 states, and 6, with 18 states, are used for statewide 
testing nearly as often as grade 8. However, there is no grade 
for which normative comparisons are available for a majority of 
the 50 states. Test results are reported by only 10 or 11 states 
at grades 1, 2, 9, and 10, and only 5 states reported normative 
test results for grade 12. 

Where possible, estimates of the percent of students in a 
state who scored above the national median were obtained 
separately for each grade tested in reading and mathematics. 
Where estimates of the percent of students above the national 
median could not be obtained, the state median percentile rank or 
the percentile rank corresponding to the statewide mean was used. 
Note that here, and throughout this report, it is the individual
pupil norms, rather than norms for school buildings or school
districts, that were used to determine percentile ranks. For
some states, estimates of both the percent of students above the 
national median and the median percentile rank or percentile rank 
of the statewide mean were available and used. 

The number of states and the number of students for which 
estimates of the percent of students above the national median
were obtained are reported in Table 3 by year of test
administration, test content, and grade. Parallel numbers are 
reported in Table 4 for states where estimates of the median 
percentile rank or the percentile rank of the statewide mean were 
obtained. The latter numbers were also used to obtain weighted 
mean percentile ranks for the states for which those results were 
obtained. In many cases the number of states and number of 
students in Tables 3 or 4 are the same for mathematics as for 
reading, due to the fact that both content areas were usually 
tested and a single number of students tested was reported for 
both tests. However, there are some differences, e.g., grade 8 
in Table 3, because results were available in reading but not 
mathematics for a given state. 

Percent of Students Above National Median. The combined
results for states for which the percent of students scoring above
the national median was available are summarized in Figure 1. The
percents shown in Figure 1 are weighted by the number of students 
tested in each grade for the states reporting data for each of 
the three years for which data were collected. Thus each bar in 
the figure represents the percent of students in the states that 
provide data in this form who scored above the national median
for a given school year and a given grade in either reading or 
mathematics. For example, the first column for grade 1, 1985-86 
is based on the 281,734 first grade students in the 7 states (see 
Table 3) who reported test results in this form and shows that 
54% of those students scored above the national median in 
reading. 
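
The weighting behind Figure 1 can be illustrated with a minimal Python sketch. The function name and the three state entries are hypothetical and are not taken from Table 3; the sketch assumes only that each state's percent is weighted by the number of students it tested.

```python
def weighted_percent_above_median(states):
    """Combine state-level percents into one student-weighted percent.

    `states` is a list of (students_tested, percent_above_median) pairs.
    """
    total_students = sum(n for n, _ in states)
    if total_students == 0:
        raise ValueError("no students tested")
    # Each state's percent counts in proportion to the students it tested.
    return sum(n * pct for n, pct in states) / total_students


# Hypothetical data for three states at one grade and year.
example_states = [(100_000, 50.0), (50_000, 60.0), (50_000, 58.0)]
combined = weighted_percent_above_median(example_states)
print(combined)
```

With these made-up values the weighted figure is 54.5, whereas the unweighted mean of the three state percents would be 56.0, because the largest state pulls the combined result toward its own percent.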

The results in Figure 1 are consistent with the general 
results reported by Cannell (1987) in that the overall percent of 
students above the national median is greater than 50 in all of 
the elementary grades in both reading and mathematics for each of
the three years studied. The percentage above the national
median is usually greater for mathematics than for reading. 
Percentages are usually higher for elementary than secondary 
grade levels. For grades 1 thru 6, the percentage of students 
scoring above the national median in mathematics ranges from a 
low of 58% in grade 4 for the 1985-86 school year to a high of 
71% in grade 2 for the 1987-88 school year, whereas the 
corresponding range for reading is from 52% (grade 5, 1985-86) to 
60% (grade 3, 1987-88). For grades 7 through 12, the percentage 
of students scoring above the national median ranges from 49% 
(grade 12, 1985-86) to 60% (grade 11, 1986-87) in mathematics and 
from 48% (grade 9, 1986-87) to 55% (grade 8, 1985-86) in reading.

It should be noted that while the percentages displayed in 
Figure 1 are generally above the naive expectation of 50%, many 
individual students are, in fact, receiving scores that are well 
below the national median. If a state reports that 55% of its 
students have scores at or above the national
median, for example, it is obviously the case that the remaining
45% of the students in the state are receiving scores below the
national median. 

The results in Figure 1 provide only a very global picture 
since they combine the data for varying numbers of states at each 
grade level. They do not, for example, provide an indication of 
the variability from state to state. Some sense of the 
variability can be obtained from Figures 2 and 3 which show the 
distributions of the percent of students above the national 
median in reading and in mathematics, respectively. 

The data for the most recent year available for a state 
were used for the distributions in Figures 2 and 3, which for 
most states was the 1987-88 school year. Each point in Figures 2 
and 3 represents the percent of students in a state who scored 
above the national median in a particular grade. 

As can be seen in Figure 2, there is considerable 
variability from state to state. The tendency for the percents 
to be greater than 50 is quite evident for the elementary grades. 
However, there are some cases where the percent is substantially 
below 50. It should be noted that the point in Figure 2 that is 
most out of line with the Cannell (1987) results is the grade 4 
reading point that corresponds to a state where only 33% of the 
students were reported to have scored above the national median. 
This state introduced a statewide test in 1987-88 and hence was 
not included in the results reported by Cannell. 

The results shown in Figure 3 for mathematics show even 
greater state-to-state variability than was seen for reading.
Consistent with the global results in Figure 1, the tendency for 
the percents to be above 50 is more evident in mathematics than 
in reading. Some of the percents in Figure 3 are extraordinarily 
high. Note, for example, grade 2 where one state reported that 
86% of the students scored above the national median. The only 
two examples of a state where the percent is below 50 for grades 
1 through 6 — the 41% at grade 4 and the 49% at grade 6 — are 
both for the state that introduced statewide testing in 1987-88
and therefore was not included in Cannell's state-level data 
collection. 

Median Percentile Ranks or Percentile Rank of State Means.
Since the percent of students scoring above the national median 
could not be estimated for all states, the median percentile 
ranks or percentile ranks of state means were also analyzed. 
Figures 4, 5, and 6, which parallel Figures 1, 2, and 3, 
respectively, display the results of the latter analyses. In 
general, the results using these percentile rank statistics are 
quite similar to the results using the percent of students 
scoring above the national median. This is so despite the 
differences in the properties of the two statistics and the fact 
that the two sets of analyses are based on different, albeit 
overlapping, subsets of states. 

The conclusions that most states are reporting results 
above the national average, that the discrepancy is greater in 
mathematics than in reading, and that the discrepancy is 
generally greater in the elementary grades than in the secondary 
grades do not depend on the use of a particular metric (e.g., the
percent of students above the national median). The same
conclusions are supported by the use of the median percentile 
rank for each state or the percentile rank of the state mean. 

Normative Comparisons Based on District Results

Data were obtained from 153 districts, or 87%, of the 
target of 175 districts. Appendix F provides a listing of the 
region, size, and SES of each of the 153 districts that returned 
questionnaires, provided reports on their testing programs, or 
completed telephone interviews. Districtwide norm-referenced
test results were available for 148 of the 153 districts. For 
the remaining 5 districts, districtwide normative comparisons 
could not be obtained for the reasons indicated in Appendix F 
(e.g., only criterion-referenced results were available). 

Also shown in Appendix F are the grades where norm- 
referenced test results were reported for each district. The 
grades where the largest number of districts report norm- 
referenced test results are grades 3, 4, 5, 6, and 8, in which 
test results were obtained for between 118 and 123 districts. As 
was shown in Table 2, those grades, with the exception of grade 
5, are also popular choices for statewide norm-referenced 
testing. 

As was done for states, estimates of the percent of 
students in a district who scored above the national median were 
obtained for each grade tested in reading and in mathematics 
whenever possible. Where these estimates could not be obtained,
the district median percentile rank or the percentile rank 
corresponding to the district mean was used. 

Estimates, based on the district data, of the percent of
students scoring above the national median in reading and 
mathematics for grades 1 through 12 are plotted in Figure 7. The 
percents plotted in Figure 7 are weighted by district size, 
region, and SES and thus are estimates of the percent of students 
nationwide at a given grade that score above the national median 
in reading or in mathematics. The number of districts on which
these estimates are based varies by grade. The number of
districts reporting data that could be used for the estimates in
Figure 7 was 57, 77, 89, 87, 88, 85, 70, 84, 61, 52, 49, and 21 
at grades 1 through 12, respectively. 

As can be seen, the estimated percent of students scoring 
above the national median is consistently above 50%. For grades 
1 through 6, at least 57% of the students are estimated to have 
scores above the national median in reading. For mathematics, at 
least 62% of students are estimated to be above the national
median in grades 1 through 6. In grades 9 thru 12 the estimates of
51 or 52% for reading are closer to 50%; however, with the
exception of grade 12 with an estimate of 54%, the percentage of
students estimated to have scores above the national median in
mathematics is 56% or higher in every grade. Although 56% is
obviously greater than 50%, it is still the case that nearly half 
the students (44%) are receiving score reports below the national 
median when 56% are scoring above the median. 

Figure 8 presents results parallel to those in Figure 7 
based on the data from districts where estimates of median 
percentile ranks or the percentile ranks of the district means
were obtained. The weighted means of these percentile rank 
statistics are based on substantially fewer districts at each 
grade (number of districts equals 17, 27, 34, 29, 31, 27, 26, 29, 
15, 16, 15, and 4 at grades 1 through 12 respectively). 
Nonetheless, the results in Figure 8 lead to conclusions that are 
essentially the same as those based on the estimated percent of 
students above the national median. With the exception of grade
12, where the number of districts reporting data in this form is 
extremely small, all of the weighted means are greater than 50. 
The results for the elementary grades are higher than those for 
the upper grades and the results for mathematics are higher than 
those for reading. 

In addition to providing overall estimates of student 
performance levels, the district results provide a basis for 
investigating between-district variability and characteristics of
districts associated with level of performance. Estimates of the 
percent of students who scored above the national median in 
reading and mathematics were obtained for a majority of the 
districts that returned test results. Distributions of these 
percents for districts were inspected at each grade level in both 
content areas. Since the complete distributions for all grades
are rather voluminous, distributions for only one grade are
presented and discussed in detail. Summaries of the
distributions for other grades are provided and complete
distributions for grades 1 through 12 are included in Appendix G, 
however. Grade 3 was chosen for illustrative purposes since it
is the earliest of the grades that are most frequently tested and 
reported by districts in the sample. 

A total of 123 districts reported norm-referenced test 
results for grade 3. Eighty-nine of those districts provided
data that could be used to estimate the percent of students
scoring above the national median in reading and mathematics.
The remaining districts reported data that could be used to
obtain the median percentile rank or the percentile rank of the
district mean but did not provide a basis for obtaining the 
percent of students scoring above the national median. 

Distributions of district percents of students scoring 
above the national median are illustrated by the stem-and-leaf 
plots in Figure 9. The "stem" corresponds to the tens digit of
the percent of students in a particular district that scored 
above the national median. The "leaf" reports the units digit 
for a district's percent. The results for each district are 
depicted by a leaf, i.e., a single digit under the leaf column, 
that is associated with a particular stem which gives the tens 
digit for each leaf in that row. For example, one district
reported that 93% of its students scored above the national 
median in reading and one district reported that 94% of its 
students scored above the median. Those two districts are 
depicted in the upper-left-hand corner of Figure 9 by the "34" 
under the leaf column next to a stem of 9. The lowest percent
above the median for reading that was reported by a district was 
15%. The results for that district are indicated by the leaf of
5 next to a stem of 1 toward the bottom of the stem-and-leaf
diagram for Reading. 
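
The construction of such a display can be sketched in Python. The sample values are hypothetical, apart from 93, 94, and 15, which echo the districts mentioned above; the function name is illustrative, and this sketch omits stems that have no districts.

```python
from collections import defaultdict


def stem_and_leaf(percents):
    """Return stem-and-leaf rows (highest stem first) for whole-number percents."""
    leaves = defaultdict(list)
    for value in percents:
        # The tens digit is the stem; the units digit is the leaf.
        stem, leaf = divmod(int(value), 10)
        leaves[stem].append(leaf)
    rows = []
    for stem in sorted(leaves, reverse=True):
        digits = "".join(str(d) for d in sorted(leaves[stem]))
        rows.append(f"{stem} | {digits}")
    return rows


# Hypothetical district percents above the national median.
sample = [93, 94, 78, 61, 66, 52, 15]
plot_rows = stem_and_leaf(sample)
print("\n".join(plot_rows))
```

The first row printed is `9 | 34`, matching the "34" leaf beside a stem of 9 described in the text, and the last row is `1 | 5`.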

As can be seen in Figure 9, a majority of the districts 
reported that 50% or more of their students scored above the
national median in both reading (61 of 89 districts) and 
mathematics (69 of 89 districts). Only 16 of the 89 districts 
reported that less than 40% of their students scored above the 
national median in reading, but there were 12 districts that 
reported that three-fourths or more of their students scored 
above the national median. In mathematics the results show even 
larger numbers of districts reporting a substantial majority of 
their students above the median. 

In order to summarize the distributions of district 
percents of students reported to have scored above the national 
median, the 10th, 25th, 50th, 75th, and 90th percentiles of the
distributions were obtained. For grade 3, those percentiles are 
reported at the bottom of the two columns of Figure 9. (Parallel 
results for the other grades are presented in Appendix G.) These 
figures indicate, for example, that 10% of the districts reported
that 32% or fewer of their third grade students scored above
the national median in reading. On the other hand, the 90th
percentile of 78 indicates that 10% of the districts reported
that over three-fourths of their third grade students scored
above the national median in reading. 

The five selected percentiles (10th, 25th, 50th, 75th, and 
90th) of the district distributions of the percent of students 
scoring above the national median were computed for all twelve
grades. Those percentiles are shown in the box-and-whisker plots 
displayed in Figures 10 and 11 for reading and mathematics, 
respectively. Looking, for example, at the grade 1 box-and- 
whisker plot for reading in Figure 10, it can be seen that the 
10th percentile for the 57 districts reporting data at grade 1
was 35, indicating that one district in 10 reported that 35% or 
less of its students scored above the national median. From the 
remaining percentiles for the grade 1 reading results it can be 
seen that one district in four reported 45% or less of its 
students scored above the national median, half the districts 
reported 55% or less, three districts in four reported 66% or 
less, and nine districts in ten reported 81% or less. 
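
The five summary percentiles can be computed as in the following Python sketch. The report does not say which percentile definition was used, so the nearest-rank rule here is an assumption, and the twenty district values are hypothetical rather than taken from Figure 10.

```python
import math


def nearest_rank_percentile(values, p):
    """Smallest value with at least p percent of the values at or below it."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    # Nearest-rank definition (an assumption; other definitions interpolate).
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical district percents above the national median (not report data).
district_percents = [28, 35, 41, 45, 50, 53, 55, 58, 60, 62,
                     64, 66, 67, 70, 72, 75, 78, 81, 85, 90]
summary = {p: nearest_rank_percentile(district_percents, p)
           for p in (10, 25, 50, 75, 90)}
print(summary)
```

Read in the same way as the text, a 10th percentile of 35 in this made-up sample means that one district in ten reported 35% or less of its students above the national median.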

From an inspection of Figure 10, it can be seen that 
districts at the 50th percentile reported that more than half 
(54% to 58%) of their students scored above the national median 
in reading in grades 1 thru 8. Only at grade 10 did a district
at the 50th percentile report that slightly less than half (48%) of
its students scored above the national median in reading. For
the elementary grades, the tendency to have more than half of the 
students in a district scoring above the national median is much 
stronger in mathematics (Figure 11) than in reading (Figure 10). 
In grades 1 thru 6, for example, the 25th percentile is equal to 
or above 50. In other words, three quarters of the districts 
have more than half their students scoring above the median. 
Moreover, half the districts have 59% or more of their students 
above the national median in mathematics for grades 1 thru 8.



The percent of districts that have more than half of their 
students scoring above the national median should not be 
interpreted as a direct indication of the percent of students 
across districts who are scoring above the median. It would be
possible, for example, for a substantial majority of districts to 
have more than half their students above the median while less 
than half of all students across districts were above the median. 
Nonetheless, it is clear that it is more common for a district to 
report test results that are "above average" than ones that are 
"below average". 

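The distinction between the percent of districts and the percent of students can be made concrete with a small Python sketch. The four districts below are entirely hypothetical; they are chosen so that three small districts sit above 50% while one very large district sits below it.

```python
# Hypothetical districts: (students_tested, percent_above_national_median).
districts = [
    (1_000, 55.0),
    (1_000, 56.0),
    (1_000, 54.0),
    (20_000, 45.0),  # one very large district below 50%
]

# Percent of districts with more than half their students above the median.
districts_above_half = sum(1 for _, pct in districts if pct > 50)
share_of_districts = 100 * districts_above_half / len(districts)

# Percent of all students, pooled across districts, above the median.
total_students = sum(n for n, _ in districts)
students_above = sum(n * pct / 100 for n, pct in districts)
share_of_students = 100 * students_above / total_students

print(share_of_districts, round(share_of_students, 1))
```

In this illustration 75% of the districts are "above average", yet only about 46% of the pooled students score above the national median, which is the situation the preceding paragraph warns about.
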
The district results provide support for the general
finding that it is more common to have students scoring above the
national median than it is to have them scoring below the median. 
However, there are more exceptions to this rule, particularly in 
reading, than were suggested by the Cannell study which reported
169 of 188 districts were "above average". Five districts 
refused to provide the information and only 14 districts were 
classified as "below average" in the Cannell study. 

Cannell's results were based on a telephone survey of the
largest districts in the sixteen states where statewide results 
were unavailable. Districts were "asked if their elementary (1-6)
total battery scores were above, at, or below the national
average" (Cannell, 1987, p. 22). A district was called above 
average if four of six grades were above the national norm, and 
scores on reading, language, and math were used in cases where 
total battery scores were unavailable. 



The greater frequency of districts with scores below the 
median suggested by Figures 10 and 11 than by the Cannell results 
is largely attributable to the difference in definitions. For 
example, one district that was classified as above average based 
on the Cannell study reported that for grades 2 through 6 the 
percents of students scoring above the national median in reading 
during the 1986-87 school year were 56, 47, 35, 44, and 48, 
respectively. While this district would appear to be "below 
average" based on these reading test results, it would appear to
be clearly "above average" based on the corresponding percents 
for mathematics (64, 64, 54, 60, and 68, for grades 2 through 6, 
respectively). In general, districts report a larger percentage
of students above the national median when using total battery or 
mathematics scores than when using reading scores.

Summary of State and District Results

Clearly it is the exception rather than the rule for a 
state to report that its students, particularly its elementary 
school students, are performing below the national average.
Although it is somewhat more common for a district than a state
to report that less than half of its students are scoring above
the national median, a substantial majority of districts report
that their students are performing above average (i.e., more than
50% of the students are reported to be above the national
median). The tendency for students to score above the national
median is especially strong in mathematics for grades 1 thru 8. 
Nonetheless, it should be noted that some districts report that 
substantially less than 50% of their students score above the
national median. At grade 3, for example, one district in ten 
reported that a third or less of its students scored above the 
national median in reading. 

Achievement Trends and Dated Norms

Although both the state and district results are generally 
consistent with the Cannell and earlier SREB findings that 
achievement test results are more often above than below the 
national norm, they provide no real indication of the reasons 
that lead to this result. As was discussed earlier, a wide
variety of factors have been suggested as possible explanations 
of the apparently high test results that are being reported by 
states and districts. General improvement in student 
achievement, at least at the elementary grades, is clearly one 
possibility. When there are upward trends in achievement, old 
norms are easier (i.e., they provide a lower standard of 
comparison) than new norms and thus a state or district whose 
students score at the current national average would score above 
the average defined by dated norms. 

Using the aggregate results for districts, the district 
percents of students scoring above the median in reading and in 
mathematics were related to the age of the norms used by 
districts at each grade (i.e., the number of years between the 
date of the test administration by a district and the date of the 
test norming by the publisher). Table 5 lists the number of
districts that provided information on the year that the norms in 
use were obtained and the percent of students scoring above the 
median for grades 1 through 12. Also shown in Table 5 are the
mean age of the norms used by districts, the mean change in the 
percent of students scoring above the median for each additional 
year since the norms were obtained, and the estimated mean change 
in the percent that results from the use of old norms rather than 
current norms. 

As can be seen in Table 5, the average district that 
returned data was using norms that are 4 or 5 years old. 
Although most districts were using the most recent norms
available from the publisher for the test being used, there is 
still an average of 4 or 5 years between the date of test 
administration by the district and the date of norming because 
publishers have typically collected norms only about every seven 
years. With a single exception, the percent of students scoring 
above the median increased in both reading and mathematics with 
each additional year since the norms were obtained. The 
exception is for reading at grade 10. By using norms that are 4
or 5 years old rather than current norms, assuming the latter 
were available, the percent of students scoring above the median 
is estimated to be higher in all but grade 10 in reading and in 
every grade for mathematics. For grades 1 through 8 the expected 
increase ranges from 2% to 9% in reading and from 6% to 11% in 
mathematics. Taking differences of the latter magnitude into 
account would largely eliminate the tendency for these districts 
to report results that are above the national median. 
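
The arithmetic behind such an estimate can be sketched in Python. Table 5 is not reproduced here, and the report does not state how the mean change per year was estimated; as one plausible reading, the sketch assumes an ordinary least-squares slope of district percents on norm age, multiplied by the mean age of the norms. All data values and names are hypothetical.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("norm ages must vary across districts")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical districts at one grade: age of norms (years) and
# percent of students above the national median.
norm_age = [2, 3, 4, 5, 6, 7]
pct_above = [54.0, 56.0, 58.0, 60.0, 62.0, 64.0]

change_per_year = least_squares_slope(norm_age, pct_above)
mean_age = sum(norm_age) / len(norm_age)
# Estimated change attributable to using old rather than current norms.
estimated_change = change_per_year * mean_age
print(change_per_year, mean_age, estimated_change)
```

In this made-up case the percent above the median rises 2 points per year of norm age, the mean age is 4.5 years, and the estimated change is 9 points. As the following paragraphs caution, a cross-sectional slope of this kind does not by itself show that older norms are easier.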

Trends Over Several Years for Selected States. The
district results in Table 5 show that there is a relationship 
between the age of norms used and the level of achievement test
scores for the districts in this sample. These results are
cross-sectional, and there may be a variety of other district
characteristics associated with the age of norms for the test 
used as well as the level of student achievement. Therefore, 
these results do not provide a sufficient basis for concluding 
either that older norms are easier than newer norms or that 
achievement has been going up. 

Figures 1 and 4, which were considered earlier, did 
present achievement test results for three years. Neither of 
these figures provides a very clear indication that achievement 
scores are going up or down during the three years for which data 
were collected. There is some suggestion from both of these 
figures that scores are going up in grades 1, 2, and 3. However, 
the direction of change is not only unclear at most other grades, 
but would be difficult to interpret in any event because the 
subset of states for which data were obtained changes somewhat 
from year to year. Furthermore, three years is too short a time 
interval to assess long-term trends. 

Though not a specific part of the data collection design, 
results included in the state assessment reports for some of the 
states made it possible to look at trends for longer time 
intervals. Achievement trends for four states are summarized in 
Figure 12. 

The upper-left-hand quadrant of Figure 12 shows a plot of 
the percent of students in one state (state A) scoring above the 
national median in reading and mathematics at grade 4 for each of 
the past six school years. During this interval a single test
form of a single edition of a test was administered each year and 
results are based on comparisons to the 1980-81 national norms 
provided by the test publisher. As can be seen, in the first year
the test was administered (1982-83), the percent of students
scoring above the national median was well below 50 for both
reading (41%) and mathematics (44%). During each of the
following five years these percents increased, most notably in
mathematics. In 1987-88, 57% of the students scored above the
national median in reading and 68% scored above the national 
median in mathematics. 

Similar results using the alternative statistic of the 
percentile rank in the individual pupil norms corresponding to 
the statewide mean test score are shown for another state (state 
B) in the upper-right-hand quadrant of Figure 12. As in the 
previous example, the results are shown for a six year period 
during which a single form of a single edition of a test was
administered each year. Comparisons are to norms obtained in 
1978 in this case. Although the trend for state B is less steep 
than the one for state A and is based on a different metric, 
there is a clear upward trend during the six years in both 
reading and mathematics. 

The third example, state C, shown in the lower-left-hand
quadrant of Figure 12, uses an entirely different metric than has 
been considered so far. The plots for state C show the percent 
of students passing statewide minimum-competency tests in reading 
and mathematics for each of 7 years. In mathematics the percent 
passing was 95 in the first year and gradually increased to 98%
over time. For reading, where there was more room for movement, 
the increases between the first and most recent years of test 
administration are more substantial. 

The final plot shown in the lower-right-hand quadrant of 
Figure 12 displays the percentile ranks of the state means in 
reading and mathematics based on individual pupil norms for grade 
3 in state D. The state D results not only span the longest time 
interval, twelve school years, but include a change in test 
editions within the period of time that is covered. A single 
form of a single edition of a test was used for the eight years 
starting in 1976-77 and running through 1983-84. The pattern for 
those first eight years is reasonably similar to the ones shown 
for the other three states in Figure 12. There is a consistent
upward trend during those years. 

The feature of the plot for state D that most clearly sets 
it apart from the plots for the other three states in Figure 12 
is the sharp decline in percentile rank between the 1983-84 and 
1984-85 school years followed by increases over the next three 
years to bring 1987-88 results back to approximately where they 
were in 1983-84. As was previously indicated, during the 1984-85 
school year the new edition of the test was introduced and the 
same form of that edition was administered in each of the last 
four years covered in the plot of results for state D. Thus the
sharp decline corresponds to the introduction of the new test 
edition. 

The sharp decline in performance relative to national 
norms that State D experienced when the new edition of the test
was introduced is not unique. Figures 13 and 14, for example, 
show the results for two large school districts that introduced 
new editions during the 1987-88 school year. As can be seen, 
both districts experienced large declines in the percent of 
students scoring above the national median between 1986-87 and 
1987-88. 

There are several possible interpretations of the trend 
results shown in Figures 12, 13, and 14. The most straightforward
interpretation of the trends in Figure 12 is that
achievement in reading and mathematics for the grades in question 
improved rather steadily in all four states. The dip when a new 
edition was introduced in state D could simply reflect general 
increases in student performance across the nation which made the 
more recent norms associated with the newer edition more 
stringent than the norms associated with the older edition of the 
test. This same interpretation could also explain the dips in 
performance levels associated with a new test edition for the two 
districts shown in Figures 13 and 14. 

An alternative interpretation of these results, however, 
is that increases in test scores simply reflect increasing 
familiarity with a given test form and more focused instruction 
on the content of that specific form. By administering the same
form of a test for several years teachers are apt to become 
increasingly familiar with the specifics of the test content and 
alter instructional emphases to better match the content of the 
test. As indicated by Mehrens and Kaminski (1988) and by Shepard 
(1989), test familiarity might influence instruction in a wide
variety of ways, ranging from practices that would generally be 
considered sound uses of test results (e.g., identifying and 
working on objectives where students show weaknesses) to those
that most educators consider unethical (e.g., teaching the
specific items on a test just prior to test administration).

It is not possible to distinguish between the possibility 
that the trends in Figures 12, 13, and 14 are due to improvements 
in achievement, to increased familiarity with the tests, or to 
some alternative explanation, solely from the results presented 
in those figures. However, other data can be brought to bear on 
the issues. In particular, the questionnaire and interview 
results which are discussed in other reports based on this 
project (e.g., Shepard, 1989) will speak to some of these issues. 
Only the question of whether norms are changing in difficulty 
with time due to increases in student achievement nationally will 
be considered here. 

Achievement Trends and Changes in the Difficulty of Norms.
National changes in achievement levels obviously lead to 
differences in the meaning of norms. During a period of 
declining performance such as the nation experienced in the 1960s 
and the first part of the 1970s (Harnischfeger & Wiley, 1975; 
Koretz, 1986; 1987), newer norms provide a less stringent
standard of comparison than older norms. Koretz (1987), for 
example, estimated that during the period of the much publicized 
test score decline (roughly the early or mid 1960s to the mid 
1970s) "the average decline in grades six and above was large 
enough that the typical (median) student at the end of the
decline exhibited the same level of achievement as was shown 
before the decline by students at the 38th percentile" (p. 2). 
Thus a state or district using old norms in the mid 1970s could 
have appeared to be well below the national average when in fact 
their students were scoring at the then current national average. 
On the other hand, when performance on achievement tests is
increasing, newer norms become harder and the use of old norms
can make a state or district appear to be above average that
would have only average or below average scores in terms of
current national norms. Clearly, national trends in achievement
test scores have importance for understanding normative
comparisons. 
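
The effect of a national shift on normative comparisons can be illustrated with a short Python sketch. It assumes, purely for illustration, that scores are normally distributed with the same spread before and after the shift; the report makes no such claim, and the function names are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # assumption: scores are normally distributed


def shift_in_sd_units(old_percentile_of_new_median):
    """National shift (in SD units) implied by where the new median
    falls in the old norms; negative values mean a decline."""
    return STANDARD_NORMAL.inv_cdf(old_percentile_of_new_median / 100)


def percent_above_old_median(national_shift_sd):
    """Percent of a nationally average group that exceeds the old-norm
    median after national achievement shifts by `national_shift_sd`."""
    return 100 * STANDARD_NORMAL.cdf(national_shift_sd)


# Koretz's figure: the post-decline median matched the pre-decline
# 38th percentile.
decline = shift_in_sd_units(38)
after_decline = percent_above_old_median(decline)
# A hypothetical later rise of the same size, judged against norms
# set at the low point, would do the reverse.
after_rise = percent_above_old_median(-decline)
print(round(decline, 2), round(after_decline), round(after_rise))
```

Under the normality assumption, the decline Koretz describes corresponds to roughly three-tenths of a standard deviation, and a rise of the same size would place about 62% of a nationally average group above the median of the dated norms.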

Although increases in test performance have not received 
as much attention as the decline of the 1960s and 1970s, several 
sources of evidence suggest that achievement test scores have 
been going up. National Assessment of Educational Progress
(NAEP) reports (e.g., Dossey, Mullis, Lindquist, & Chambers,
1988; NAEP, 1985) indicate that there have been some increases in
both reading and mathematics between the early or mid 1970s and
the mid 1980s. Based on his review of NAEP and data from several
other tests, Koretz (1987) concluded that the decline in test
scores ended with cohorts of students that entered school in the
late 1960s and that subsequent cohorts of students "produced a 
sharp rise in scores on most, but not all, tests. In the 
majority of instances in which scores increased, the rise has 
been steady — with each cohort tending to outscore the preceding 
one — and often roughly as fast as the decline" (p. 2) . 




Norming studies conducted periodically for standardized 
tests also provide evidence regarding trends in national 
achievement. When a new edition of a standardized test is 
introduced, it is customary not only to collect new normative 
data for the new edition but also to equate the old and new 
editions of the test. The equatings make it possible to estimate 
the extent to which achievement has increased or decreased over 
the years between the norming of the two editions. In some 
cases, new norms are collected for a previously normed edition of 
a test, which again provides a means of comparing national 
performance on the test at two points in time. 

Several test publishers reported increases in achievement 
based on the results of their norming studies. CTB/McGraw-Hill 
(1987), for example, noted when the norms for Form E of the 
California Achievement Tests (CAT) were reported and compared to 
the norms for the CAT Form C to which Form E was equated that 
"the CAT E norms are more difficult than the CAT C norms. This 
seems to indicate that students in 1984-85 were achieving at a 
higher level than in 1977, when CAT C was normed" (p. 3-34). 
Increases in performance were reported when Form G of the Iowa 
Tests of Basic Skills (ITBS) was published: "Between 1977-78 and 
1984-85, the improvement in ITBS test performance more than made 
up for previous losses in most test areas. Composite achievement 
in 1984-85 was at an all time high in nearly all test areas" 
(Hieronymus & Hoover, 1986, p. 148). Increases in performance 
have also been reported for the Stanford Achievement Test (SAT7) 
(Wiser & Lenke, 1987) and the Comprehensive Tests of Basic Skills 
(CTBS) (Rothman, 1988), and increases can be inferred from 
comparisons of the norms for the Metropolitan Achievement Tests 
(MAT6) (Psychological Corporation, 1988) and norms for equivalent 
scores on the previous edition of the MAT (Prescott, Balow, 
Hogan, & Farr, 1978; 1986). 

Table 6 provides a summary of the changes in the 
percentile rank of achievement test scores that are at the 
national median at one of the two times that norms were obtained 
for the six most used standardized achievement tests. The 
numbers are estimates of the changes in national percentile rank 
in reading and mathematics between the two norming years 
indicated at the head of each column of the table. Also shown 
for comparative purposes are estimated changes in national 
percentile ranks based on NAEP. 

As is indicated in the footnotes to Table 6, the numbers 
in each column of Table 6 are derived from different sources and 
involve different types of comparisons. In the case of the CTBS 
the comparison is between 1981 norms and estimates of 1987 norms 
for the same test form based upon a weighting of user data. The 
Stanford results are based on 1981-82 and 1986 norming studies 
for the same test form. The other published test comparisons 
involve norming studies for successive editions of the test 
battery. However, the numbers in Table 6 all have a similar 
interpretation. A positive number indicates that performance was 
higher when measured at the more recent of the two norming years 
indicated at the top of each column. For example, the number 14 
shown for reading achievement on the California Achievement Tests 
(CAT) in grade 2 indicates that an equated Form C or Form E score 
that would have placed a student at the national 50th percentile 
using the 1977 Form C norms would lead to a national percentile 
rank of only 36 using the 1984-85 Form E norms. The 14 shown in 
Table 6 is the difference between the percentile ranks of 50 in 
1977 and 36 in 1984-85. 

With the exception of the SRA Achievement Series, the 
differences for grades 1 through 8 are all positive, indicating that 
more recent norms are more stringent than older norms for five of 
the six tests. For grades 10 through 12 the differences are 
generally smaller than those shown for the earlier grades and two 
of the four tests with results for the high school grades have 
some differences that are negative, indicating a decline in 
performance and therefore easier recent norms in those instances. 

The changes in percentile ranks shown in Table 6 are based 
on various time intervals between norming studies. More direct 
comparison can be made by dividing the changes in percentile 
ranks in Table 6 by the number of years between the norming 
studies to obtain estimates of yearly changes in percentile 
ranks. Such yearly changes in percentile ranks for grades 1 through 
8 are presented graphically in Figures 15 and 16 for reading and 
mathematics, respectively. 
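
The annualization described in this paragraph can be sketched as follows. This is a minimal illustration and not the authors' exact procedure: the function name is hypothetical, and the 7.5-year interval (treating the span from the 1977 norming to the 1984-85 norming as seven and a half years) is an assumption made only for the example. The 14-point change is the CAT grade 2 reading figure discussed earlier in the text.

```python
def yearly_percentile_change(change_in_rank, years_between_normings):
    """Average change in national percentile rank per year between two normings."""
    if years_between_normings <= 0:
        raise ValueError("years_between_normings must be positive")
    return change_in_rank / years_between_normings


# CAT grade 2 reading: a score at the 50th percentile on the 1977 norms
# falls at the 36th percentile on the 1984-85 norms.
cat_grade2_change = 50 - 36

# ASSUMPTION: the interval between normings is taken to be 7.5 years.
cat_grade2_yearly = yearly_percentile_change(cat_grade2_change, 7.5)
print(f"{cat_grade2_yearly:.2f} percentile points per year")
```

Dividing by the interval puts tests with different gaps between normings on a common per-year footing, which is what allows the comparisons across publishers.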

In general, the results in Figures 15 and 16 are fairly 
consistent with those based on the analyses of the district data 
that were reported in Table 5. The estimates of yearly changes 
derived from the district data are greater than those shown in 
Figures 15 and 16 for some tests but smaller than those for other 
tests. The Table 5 estimates of changes in norm-referenced 
performance that would be expected due to a change in the date of 
the norms, however, are of the same order of magnitude as those 
shown in Figures 15 and 16. 

Although the NAEP trend results are based on age cohorts 
rather than grade cohorts, the NAEP results represent the best 
available independent means of estimating national changes in 
achievement. Changes in percentile ranks estimated from NAEP 
results between the 1974-75 and 1983-84 assessments for reading 
and between 1977-78 and 1985-86 for mathematics are plotted in 
Figures 17, 18, and 19 for 9, 13, and 17 year olds, respectively. 
Also shown in these figures are the changes for the six norm- 
referenced tests at the modal grades for 9, 13, and 17 year olds, 
that is, grades 3, 7, and 11. 

As can be seen in these figures, the different data 
sources vary a good deal in the magnitude of change in 
performance. The NAEP results suggest either some increase in 
performance (ages 9 and 17 in reading and ages 9 and 13 in 
mathematics) or no change during the interval in question. The 
increases indicated by NAEP are smaller than those shown by some, 
but not all, of the standardized tests. Comparing the publisher 
grade 3 results with NAEP age 9 results (Figure 17), it can be 
seen that four of the six standardized tests show larger gains in 
reading and five of the six show larger gains in mathematics than 
would be estimated by NAEP. At age 13 (Figure 18) NAEP showed no 
change in reading and two of the standardized tests (SRA and 
Stanford) indicated only small changes at grade 7, but the 
remaining four tests suggested more substantial increases in 
performance. In mathematics, two standardized tests suggest 
smaller changes at grade 7 than NAEP obtained for 13 year olds, 
one standardized test shows a change similar to the one obtained 
by NAEP, and the remaining three standardized tests show larger 
gains in performance. At grade 11 or age 17 (Figure 19), 
relatively little change is indicated by any of the data sources 
for reading and relatively small and inconsistent changes are 
indicated for mathematics. 

Of course, the dates of the first and second normings are 
not the same for all the tests and the tests differ in content 
coverage and in the specifics of the samples on which the norms 
are based. Nonetheless, the different data sources give rather 
different answers in some cases to the question of the degree to 
which test performance has increased during the past decade. The 
discrepancy between increases suggested by NAEP and most of the 
standardized tests raises questions about the possibility that 
artifacts may inflate the norm-referenced test results. 

One possible artifact is that the norms obtained for a 
standardized test may be biased due to differential participation 
rates in norming studies by school districts according to whether 
or not the districts are already using the standardized test 
being normed (Baglin, 1981). If school districts that are 
already using a standardized test are more likely to participate 
in the norming of a new edition of the test than districts using 
another publisher's test, and if districts that are using a given 
test generally have curricula that match more closely the 
objectives of both the new and old editions of that test or 
emphasize those objectives because the test is used, then the 
norms could be more difficult. In other words, such an influence 
would run counter to the observed tendency for states and 
districts to report that more than 50% of their students score 
above the national median. 

To investigate the latter possibility, Wiser and Lenke 
(1987) compared the performance of user and non-user groups when 
the 1986 norms for the Stanford were obtained. They found that 
"users performed as well or better than non-users in all subject 
areas through grade 6." For grades 7 through 12 the results were 
more mixed with users performing better in some subject areas at 
some grades but non-users performing better for other 
combinations. 

Wiser and Lenke note that the comparison of particular 
interest in their results is between the 1986 non-users and the 
1982 norming sample. Since the Stanford 7 was a new edition at 
the time of the 1982 norming, the participants in the norming 
sample had not previously used the edition and were comparable in 
that sense to the 1986 non-user sample. The 1982 sample and the 
1986 non-user samples were also matched on school ability as 
measured by the Otis-Lennon School Ability Test. Thus a 
comparison of the 1982 and 1986 non-user results provides an 
estimate of the change in achievement that is uncontaminated by 
the familiarity that users have with the particular edition of 
the test. 




We used the scaled score means and standard deviations 
reported by Wiser and Lenke (1987) to calculate two estimates of 
the changes in average test scores in terms of 1982 standard 
deviation units for total reading and total mathematics. The 
first estimate is simply the mean for the full 1986 norming 
sample (users and non-users) minus the 1982 mean, all divided by 
the 1982 standard deviation. The second estimate is the 1986 
mean for non-users only minus the 1982 mean, all divided by the 
1982 standard deviation. The two sets of standardized 
differences are summarized in Table 7. 
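
The two estimates described above are both standardized mean differences. A minimal sketch follows. The function name and all scaled-score values are hypothetical; the numbers are not taken from Wiser and Lenke (1987) and were chosen only so that the two results come out at .17 and .10, the grade 1 reading values discussed in the text.

```python
def standardized_gain(mean_1986, mean_1982, sd_1982):
    """Change in mean scaled score, expressed in 1982 standard deviation units."""
    if sd_1982 <= 0:
        raise ValueError("sd_1982 must be positive")
    return (mean_1986 - mean_1982) / sd_1982


# HYPOTHETICAL scaled scores, for illustration only.
mean_1982 = 600.0
sd_1982 = 40.0
mean_1986_full_sample = 606.8   # users and non-users combined
mean_1986_non_users = 604.0     # non-users only

gain_full = standardized_gain(mean_1986_full_sample, mean_1982, sd_1982)
gain_non_users = standardized_gain(mean_1986_non_users, mean_1982, sd_1982)
print(f"full sample: {gain_full:.2f} SD, non-users: {gain_non_users:.2f} SD")
```

Both estimates use the same 1982 mean and standard deviation, so any difference between them reflects only which 1986 group supplies the later mean.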

For grades 1 and 2 the non-user group data result in 
estimates of the gain in achievement in reading between 1982 and 
1986 that are substantially smaller than the estimates based on 
the total norming sample. The gain in reading achievement 
appears to be about 40% smaller (i.e., 100 x (.17 - .10)/.17) at 
grade 1 and about 70% smaller at grade 2 with non-user data than 
with the data from the total norming sample. This difference is 
consistent with the premise that familiarity with a test form 
leads to inflated estimates of achievement gains. However, large 
differences in estimates based on non-user and total norming 
sample data such as those for reading in grades 1 and 2 are not 
found consistently. 
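
The percentage comparison used in this paragraph can be written out as a short calculation. This is only a sketch with a hypothetical function name; the inputs .17 and .10 are the grade 1 reading values quoted in the text.

```python
def percent_smaller(total_sample_gain, non_user_gain):
    """How much smaller the non-user gain is, as a percentage of the total-sample gain."""
    if total_sample_gain == 0:
        raise ValueError("total_sample_gain must be nonzero")
    return 100 * (total_sample_gain - non_user_gain) / total_sample_gain


grade1_reading = percent_smaller(0.17, 0.10)
print(f"grade 1 reading: about {grade1_reading:.0f}% smaller")
```

A negative result would indicate that the non-user estimate is larger than the total-sample estimate, as the text reports for some of the high school grades.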

The non-user estimates of standardized gains in reading 
achievement are smaller than the total-norming-group estimates in 
grades 1 through 6 and grades 8 and 9, albeit by only a trivial 
amount at grade 3. The two sets of estimates are the same to two 
decimal places in grades 7 and 10, and the non-user estimates are 
actually larger than those based on the total norming sample at 
grades 11 and 12. For mathematics, non-user estimates of 
achievement gains are 20% or more lower than total group 
estimates only at grades 2, 2, 6, and 7, while they are larger by 
an equal percentage or more at grades 9, 11, and 12. 

Overall, the Wiser and Lenke results suggest that 
increasing familiarity with a particular test form may explain 
part of the apparent growth in norm-referenced test performance. 
The generally higher scores obtained by non-users in 1986 than 
were obtained in the 1982 norming of the then new edition of the 
test, however, suggest that there has also been some more 
generalized improvement in performance, particularly in 
mathematics. 

Results recently reported by Hoover (1989) for the Iowa 
Tests of Basic Skills (ITBS) suggest that much of the increase in 
performance on a test form may occur on the first operational 
administration of the form. From user data weighted to estimate 
national performance, Hoover estimated that approximately 55% of 
the students scored above the 1984-85 national median across 
grades 3 through 8 on the Battery Composite when forms G and H 
were first administered operationally in 1985-86. In the second 
and third years of operational administration the average percent 
of students across grades 3 through 8 who scored above the 1984-85 
national median increased to 59% (1986-87) and then to 60% 
(1987-88). 

The gains from year 1 to years 2 and 3 of operational use 
reported by Hoover may be attributable to a combination of real 
gains in achievement and increasing familiarity with a test form. 
The relatively large gain in the first year that the test is used 
operationally, however, may be due to a combination of several 
additional factors such as (1) the selection of a test that is 
most closely aligned with the state or district curriculum, (2) 
greater emphasis on the importance of good test performance when 
the test is used operationally than when it is normed, and (3) 
the exclusion of a larger fraction of less able students in 
operational test administrations than in norming studies. 
Indirect support for the latter explanation comes from Hoover's 
finding that only about 6%, rather than the expected 10%, of the 
students scored below the 10th percentile during the first year 
of operational administration of forms G and H of the ITBS. High 
scores (at or above the 90th percentile), on the other hand, 
occur at the expected rate of 10% in the first year of 
operational test use. 

DISCUSSION 

Weighted estimates from the district sample suggest that 
at least 57% of the students in grades 1 through 6 are obtaining 
scores above the national median on norm-referenced reading 
tests. The corresponding figure for mathematics is 6:.%. The 
comparable figures for grades 7 through 12 are lower, but still 
somewhat greater than 50%. The state results are quite 
consistent with the district estimates. Thus, the results of the 
present study provide additional support for the general finding 
by Cannell and by the SREB that for the elementary grades almost 
all states and the majority of districts are reporting 
norm-referenced achievement test results that are above the national 
median. 

While supporting Cannell's general finding that it is more 
common for a state or district to obtain test results for their 
students that are "above the national average", our analyses lead 
us to conclusions that are different, and certainly less 
sensational, than the ones he reached. To begin with, it is 
important to put the "above average" findings in context. Many 
students are receiving scores that are "below average" even in 
districts or states that are reporting substantially more than 
50% of their students are "scoring above the national average". 
When a district reports that 57% of its students obtained reading 
scores that are at or above the national median, for example, the 
other 43% of the students obviously scored below the median. It 
should also be emphasized that although most districts report 
results that are "above the national average", there are still 
many districts throughout the nation that are reporting results 
that are below average. One out of ten districts in our sample, 
for example, reported that only about a third of its students 
at a given grade scored above the national median in reading. 

Cannell (1987) concluded that norm-referenced achievement 
tests are producing inflated reports from states and districts on 
the achievement of their students. But the finding that more 
than half the students are scoring above the national median that 
was obtained when the norms were established does not necessarily 
imply that the results are inflated. There are many factors that 
may lead to the general finding, but it seems clear that the use 
of "old" norms is one of the major factors that contributes to 
the abundance of "above average" scores. 

The evidence reviewed provides strong support for the 
conclusion that norms obtained for grades 1 through 8 during the 
late 1970s or early 1980s are easier on most tests than more 
recent norms. Consequently, a state or district where the 
average student scores at the current national average will be 
accurately reported to be above the national average defined by 
norms that are several years old. It appears that a substantial 
fraction of the "Lake Wobegon" phenomenon may be attributable to 
the use of old norms. It should be noted that the use of "old" 
norms is not purposeful on the part of school districts or 
states; they generally use the most recent norms available. 
Since standardized tests are usually normed every seven years, 
the most recent norms available will, on average, be 3.5 years 
old in most school years. 
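
The 3.5-year figure follows if a test administration is equally likely to fall at any point in a seven-year norming cycle. The sketch below approximates that average numerically. The function name is hypothetical, and the model rests on two assumptions not stated in the text: the cycle length is a whole number of years, and new norms become available the moment they are collected, with no publication lag.

```python
def average_norm_age(cycle_years, steps_per_year=1000):
    """Mean age of the most recent norms, averaged evenly over one norming cycle."""
    n = cycle_years * steps_per_year
    # Midpoint rule: sample the age of the norms at the centre of each small step.
    ages = [(i + 0.5) / steps_per_year for i in range(n)]
    return sum(ages) / n


seven_year_cycle = average_norm_age(7)
print(f"average age of norms: {seven_year_cycle:.1f} years")
```

Under these assumptions the average is simply half the cycle length. Any lag between the norming study and the release of the norms would add to it.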

Concerns about dated norms have led to suggestions that 
publishers should produce current annual norms (e.g., Cannell, 
1988; Phillips and Finn, 1988) and publishers are now attempting 
to do this by obtaining weighted estimates of national results 
from user data (e.g., Rothman, 1988). As Shepard (1989) has 
pointed out, however, annual norms based on user data potentially 
have several serious defects. If users differ from nonusers in 
ways other than those reflected by the demographic variables used 
for weighting, then user-based annual norms may be worse than 
dated norms where there is at least an understood frame of 
reference. In particular, if test familiarity leads to higher 
test performance, a state or district that changes publishers and 
administers a several year old test form for the first time would 
be at a disadvantage when compared to user norms (Shepard, 1989). 

The alternative of conducting special national norming 
studies every year, or even every other year, is not a realistic 
or desirable possibility. Norming is not only expensive, but the 
quality of the results is very dependent on voluntary 
participation of schools and well motivated students. Current 
participation rates in norming studies conducted roughly every 
six or seven years by a publisher are already far lower than 
would be desired. More frequent attempts to norm tests would 
surely lower the participation rates still further and thereby 
degrade the quality of the norms. Finally, it should be noted 
that although more recent norms provide a more stringent standard 
of comparison when scores are going up as they have been during 
the last decade, they would provide a less stringent standard 
during periods of decline in scores such as that experienced 
between the mid 1960s and the mid 1970s. Thus, we do not believe 
that annual norms are an appropriate or effective way to deal with 
problems caused by dated norms. 

Emphasis needs to be given to the changing meaning of 
norms and the age of the norms that are used in any reporting of 
test scores. It obviously is not sufficient to report that 
"students in state X are scoring above the national average" 
without clearly indicating the year in which the norms were 
obtained. Simply noting the year of the norms is not enough, 
however. An explanation of the implications of shifting norms 
also needs to be provided along with an indication of what is 
known about recent trends in the stringency of national norms. 

There is ample evidence that scores on norm-referenced 
tests have been going up in grades 1 through 8 in recent years. 
But the more important question is: Has student achievement 
improved in recent years? Unfortunately, the answer to the 
latter question is equivocal. 

Achievement test scores are of interest to the degree that 
they enable valid inferences to be made about broader achievement 
domains. But little attention has been given to the issue of the 
degree to which valid generalizations about broad achievement 
domains can be made from state or district test results. 

Comparisons of the changes in norms of standardized tests 
with estimates of changes in achievement based on NAEP results 
suggest that test norms may be changing more rapidly than is 
student achievement as measured by NAEP. The Wiser and Lenke 
(1987) findings that apparent increases are generally smaller for 
non-users than for users of a given test series suggest that part 
of the apparent growth in achievement based on norm-referenced 
test results may be due to increased familiarity with a 
particular form of a test. Only part of the apparent gain can be 
explained in this way, however. 

The differences between the gains in performance indicated 
by NAEP and by norm-referenced tests, and between Wiser and 
Lenke's total norming sample and their non-users, at the very 
least, suggest that caution is needed in interpreting gains in 
norm-referenced test scores as reflections of the amount of 
improvement that has taken place in achievement, more broadly 
defined. More direct assessments of the degree of 
generalizability of results to other tests and to other 
indicators of student achievement are greatly needed, however. 

Hoover's (1989) finding that only about 6% of the students 
scored below the 10th percentile in the first year of operational 
administration of forms G and H of the ITBS suggests that roughly 
a third to a half of the difference between the percent of 
students scoring above the national median and the naive 
expectation of 50% may occur in the first year of use and be due 
to what happens with the least able students. This suggests that 
greater emphasis in reporting needs to be given to the lower end 
of the score distribution and to the students who are excluded 
from testing when results are reported by states or districts. 
It may be quite appropriate, indeed desirable, to exclude 
students with limited English proficiency or students receiving 
particular types of special education services from a norm- 
referenced test administration. Such students should not be 
ignored, however, when district or state achievement results are 
reported. At a minimum, the number of such students and the 
reasons for exclusion from testing should be reported. 
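
One way to see how exclusion at the bottom of the distribution could raise the percent above the median is the simple model sketched below. It is an illustration under stated assumptions and not necessarily the calculation behind the estimate in the text: it assumes that every excluded student would have scored below the 10th percentile and that the tested students otherwise match the norming population. The function name is hypothetical.

```python
def exclusion_effect(observed_share_below_p10):
    """Return (excluded share of all students, share of tested students above the median).

    ASSUMPTION: exclusions come only from below the national 10th percentile.
    If a share x of all students is excluded, the tested group has
    (0.10 - x) / (1 - x) below the 10th percentile; solve that for x.
    """
    nominal = 0.10
    excluded = (nominal - observed_share_below_p10) / (1 - observed_share_below_p10)
    # The half of the population above the median is now a larger share of those tested.
    share_above_median = 0.50 / (1 - excluded)
    return excluded, share_above_median


excluded, above = exclusion_effect(0.06)
print(f"excluded share of all students: {excluded:.3f}")
print(f"share of tested students above the national median: {above:.3f}")
```

With 6% observed below the 10th percentile, this model puts about 52% of tested students above the national median, roughly 2 points over the naive 50%. Set against the 55% that Hoover reported for the first operational year, or the 57% district reading estimate, that is on the order of a third to a half of the excess.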

The practice of using a single form of a test year after 
year poses a logical threat to making inferences about the larger 
domain of achievement. Scores may be raised by focusing narrowly 
on the test objectives without improving achievement across the 
broader domain that the test objectives are intended to 
represent. Worse still, practice on nearly identical or even the 
actual items that appear on a test may be given. But, as Dyer 
aptly noted some years ago, "if you use the test exercises as an 
instrument of teaching you destroy the usefulness of the test as 
an instrument for measuring the effects of teaching" (1973, p. 
89). 

Current accountability pressures place great emphasis on 
test scores. It is unlikely that any single test, no matter how 
well constructed, normed, and validated, can withstand the 
pressures to serve both as an instrument of instruction and an 
instrument for measuring the effects of instruction. Making 
valid inferences about broad achievement domains from test scores 
has always been a challenging and difficult undertaking, but is 
made all the harder by current demands for accountability and the 
use of standardized test results as primary indicators of 
accountability. 




References 

Angoff, W. H. (1971). Scales, norms, and equivalent scores. In 
R. L. Thorndike (Ed.), Educational Measurement. Second 
Edition. Washington, DC: American Council on Education. 

Baglin, R. F. (1981). Does "nationally" normed really mean 
nationally? Journal of Educational Measurement. 97-107. 

Baker, E. L. (1989). What's the use? Standardized tests and 
educational policy. Technical Report. Los Angeles, CA: UCLA 
Center for Research on Evaluation, Standards, and Student 
Testing (Grant No. OERI-G-86-0003). May. 

Burstein, L. (1989). Looking behind the "average"; How are 
states reporting test results? Technical report. Los 
Angeles, CA: UCLA Center for Research on Evaluation, 
Standards, and Student Testing (Grant No. OERI-G-86-0003) . 
May. 

Cannell, J. J. (1987). Nationally normed elementary achievement 
testing in America's public schools: How all 50 states are 
above the national average. Second Edition. Daniels, West 
Virginia: Friends for Education. 

Cannell, J. J. (1988a). Nationally normed elementary 
achievement testing in America's public schools: How all 50 
states are above the national average. Educational 
Measurement: Issues and Practice. 7, No. 2, 5-9. 

Cannell, J. J. (1988b). The Lake Wobegon effect revisited. 
Educational Measurement: Issues and Practice. 7, No. 4, 12-15. 




CTB/McGraw-Hill. (1987). Technical Report: California 
Achievement Tests. Forms E and F. Levels 10-20. Monterey, CA: 
CTB/McGraw-Hill. 

CTB/McGraw-Hill. (1988). CTB/McGraw-Hill studies show students 
achieving at higher levels in basic skills. CTB/McGraw-Hill 
press release. November, 1988. Monterey, CA: CTB/McGraw-Hill. 

Dossey, J. A., Mullis, I. V. S., Lindquist, M. M., & Chambers, D. 
L. (1988). The mathematics report card. Are we measuring 
up? Trends and achievement based on the 1986 national 
assessment. Princeton, NJ: Educational Testing Service. 

Drahozal, E. C. & Frisbie, D. A. (1988). Riverside comments on 
the Friends of Education report. Educational Measurement: 
Issues and Practice. 7, No. 2, 12-16. 

Dyer, H. S. (1973). Recycling the problems in testing. 
Proceedings of the 1972 Invitational Conference on Testing 
Problems. Princeton, NJ: Educational Testing Service. 

Gardner, E. F., Madden, R., Rudman, H. C., Karlsen, B., Merwin, J. 
C., Callis, R., & Collins. (1983). Stanford Achievement Test 
series. Multilevel norms booklet. National. The 
Psychological Corporation, Harcourt Brace Jovanovich, Inc. 

Gardner, E. F., Madden, R., Rudman, H. C., Karlsen, B., Merwin, J. 
C., Callis, R., & Collins. (1987). Stanford 7 Plus. 
Multilevel norms booklet. National. The Psychological 
Corporation, Harcourt Brace Jovanovich, Inc. 




Harnischfeger, A. & Wiley, D. E. (1975). Achievement test score 
decline: Do we need to worry? Chicago: ML-Group for Policy 
Studies in Education. 

Hieronymus, A. N. & Hoover, H. D. (1986). Iowa Tests of Basic 
Skills. Forms G/H. Manual for school administrators. Levels 
5-14. Chicago, IL: The Riverside Publishing Co. 

Hoover, H. D. (1989). Reactions to "Lake Wobegon one year 
later: Results from a replication of the Cannell study." 
Symposium presented at 1989 Assessment Conference of the 
Educational Commission of the States. Boulder, CO: June. 

Korcheck, S. A. (1988). Measuring student learning: Statewide 
student assessment programs in the SREB states. Atlanta, GA: 
Southern Regional Education Board. 

Koretz, D. (1986). Trends in educational achievement. 
Washington, DC: Congressional Budget Office. 

Koretz, D. (1987). Educational achievement: Explanations and 
implications of recent trends. Washington, DC: Congressional 
Budget Office. 

Koretz, D. (1988). Arriving at Lake Wobegon: Are standardized 
tests exaggerating achievement and distorting instruction? 
American Educator. 12, No. 2, 8-15, 46-52. 

Lenke, J. M. & Keene, J. M. (1988). A response to John J. 
Cannell. Educational Measurement: Issues and Practice. 7, No. 
2, 16-18. 

Mehrens, W. A. & Kaminski, J. (1988). Using commercial test 
preparation materials for improving standardized test scores: 
Fruitful, fruitless or fraudulent? Paper presented at the 
annual meeting of the American Educational Research 
Association, New Orleans: April. 

National Assessment of Educational Progress. (1985). The 
reading report card. Progress toward excellence in our 
schools. Trends in reading over four national assessments, 
1971-1984. Princeton, NJ: Educational Testing Service. 

Petersen, N. S., Kolen, M. J., & Hoover, H. D. (1989). Scaling, 
norming, and equating. In R. L. Linn (Ed.), Educational 
Measurement. Third edition. New York: Macmillan. 

Phillips, G. W. & Finn, C. E. Jr. (1988). The Lake Wobegon 
Effect: A skeleton in the testing closet? Educational 
Measurement: Issues and Practice. 7, No. 2, 10-12. 

Prescott, G. A., Balow, I. H., Hogan, T. P., & Farr, R. C. 
(1978). Metropolitan Achievement Tests Forms JS and KS. 
Complete Survey Battery. Teacher's manual for administering 
and interpreting. The Psychological Corporation, Harcourt 
Brace Jovanovich, Inc. 

Prescott, G. A., Balow, I. H., Hogan, T. P., & Farr, R. C. 
(1986). MAT6. Metropolitan Achievement Tests Forms L and M. 
Survey battery multilevel national norms booklet. The 
Psychological Corporation, Harcourt Brace Jovanovich, Inc. 

The Psychological Corporation. (1988). MAT5 to MAT6 Survey. 
Corresponding scaled scores of the fifth and sixth editions of 
the Metropolitan Achievement Test Survey Battery. The 
Psychological Corporation, Harcourt Brace Jovanovich, Inc. 

Qualls-Payne, A. L. (1988). Educational Measurement: Issues and 
Practice. 7, No. 2, 21-22. 






Rothman, R. (1988). Pupil gains not just mythical, test 
publisher says study shows. Education Week. November 16, 
1988, p. 20. 

Science Research Associates. (1979). SRA Achievement Series 
Forms 1 and 2. Norms and conversion tables. Chicago: Science 
Research Associates, Inc. 

Science Research Associates. (1986). SRA Achievement Series 
Forms I and Answer keys, norms and conversion tables. 
Chicago: Science Research Associates, Inc. 

Shepard, L. A. (1989). Inflated test score gains: Is it old 
norms or teaching the test? Technical Report. Los Angeles, 
CA: UCLA Center for Research on Evaluation, Standards, and 
Student Testing (Grant No. OERI-G-86-0003). March. 

Southern Regional Education Board. (1984). Measuring 
educational progress in the South: Student assessment. 
Atlanta, GA: Southern Regional Education Board. 

Stonehill, R. M. (1988). Norm-referenced test gains may be 
real: A response to John Jacob Cannell. Educational 
Measurement: Issues and Practice. 7, No. 2, 23-24. 

Williams, P. L. (1988). The time-bound nature of norms: 
Understandings and misunderstandings. Educational 
Measurement: Issues and Practice. 7, No. 2, 18-21. 

Wiser, B. & Lenke, J. M. (1987). The stability of achievement 
test norms over time. Paper presented at the annual meeting 
of the National Council on Measurement in Education. 
Washington, DC: April, 1987. 






Yen, W. M., Green, D. R., & Burket, G. R. (1987). Valid 
normative information from customized tests. Educational 
Measurement: Issues and Practice, No. 1, 7-13. 






Table 1 

Definitions of Stratification Variables Used to Sample 

School Districts 



A. REGION. Region of the country was defined to have 4 strata. 

1. East. 

Connecticut, Delaware, District of Columbia, Maine, 
Maryland, Massachusetts, New Hampshire, New Jersey, 
New York, Pennsylvania, Rhode Island, Vermont 

2. North/Central 

Illinois, Indiana, Iowa, Kansas, Michigan, Minnesota, 
Missouri, Nebraska, North Dakota, Ohio, South Dakota, 
Wisconsin 

3. South

Alabama, Arkansas, Florida, Georgia, Kentucky, Louisiana, 
Mississippi, North Carolina, South Carolina, Tennessee, 
Virginia, West Virginia 

4. West 

Alaska, Arizona, California, Colorado, Hawaii, Idaho, 
Montana, Nevada, New Mexico, Oklahoma, Oregon, Texas, 
Utah, Washington, Wyoming 

B. SIZE. District enrollment, 1980 Census, 8 strata. 



1. Less than 1,200 

2. 1,200 to 2,499 

3. 2,500 to 4,999 

4. 5,000 to 9,999 



5. 10,000 to 24,999 

6. 25,000 to 49,999 

7. 50,000 to 99,999 

8. 100,000 or more 



C. SES. Community socio-economic status index based on the 1980
Census. SES equals the median family income in
thousands of dollars plus 6 times the median years of
education of the population 25 years old or older.
SES was used to define 5 strata. The labels of the strata
and approximate percentage of students in each are:

1. Low (15%) 

2. Below Average (20%) 

3. Average (30%) 

4. Above Average (20%) 

5. High (15%) 
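
The SES index defined in item C can be sketched in a few lines of Python. This is only an illustration of the stated formula; the function name and the sample income and education values are hypothetical and do not come from the report.

```python
def ses_index(median_family_income_dollars, median_years_education):
    """SES = median family income in thousands of dollars
    plus 6 times the median years of education (adults 25 and older)."""
    income_in_thousands = median_family_income_dollars / 1000.0
    return income_in_thousands + 6.0 * median_years_education


# Hypothetical community: $20,000 median family income, 12.0 median years of schooling.
example_ses = ses_index(20_000, 12.0)
print(example_ses)
```

With these made-up inputs the income term contributes 20 points and the education term 72, so under this definition the education component carries most of the index.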






Table 2

States with Norm-Referenced Comparisons and
Grades Where at Least One Comparison is Available

                                      Grades
               Basis of
State          Comparison*  1  2  3  4  5  6  7  8  9 10 11 12

Alabama        NRT          +        +  +     +  +     +
Alaska         NRT                      +  +  +  +  +  +  +
Arizona        NRT          +  +  +  +  +     +  +  +  +  +  +
Arkansas       NRT                            +        +
California     LINK                        +     +           +
Colorado       NRT                +                       +
Delaware       NRT                         +  +  +        +
Georgia        NRT             +                    +
Hawaii         NRT                         +     +
Idaho          NRT                                        +
Illinois       LINK               +
Indiana        NRT          +  +  +        +     +  +     +
Iowa           NRT          +        +     +  +
Kentucky       NRT/LINK     +  +  +  +  +  +  +     +  +  +  +
Louisiana      NRT                   +     +        +
Maryland       NRT                +     +        +
Mississippi    NRT          +        +     +
Missouri       LINK               +              +
Nevada         NRT
New Hampshire  NRT                   +           +
New Mexico     NRT                +              +
No. Carolina   NRT             +  +        +     +
No. Dakota     NRT                +     +     +
Oklahoma       NRT                            +
Oregon         LINK                              +
Rhode Island   NRT                         +     +
So. Carolina   NRT                   +  +     +     +     +
So. Dakota     NRT                   +           +        +
Tennessee      NRT                      +     +     +        +
Texas          LINK         +     +           +     +
Utah           NRT                      +                 +
Virginia       NRT                   +           +        +
Washington     NRT                   +           +     +
West Virginia  NRT                         +        +     +
Wisconsin      NRT                   +           +        +

Number         35          10 10 20 16 13 18 13 22 11 11 13  5

*NRT = Norm-Referenced Test; LINK = Equated to NRT;
NRT/LINK = Some years based on NRT and others on LINK



Table 3

Number of States and Number of Students Contributing to
Estimates of Percent of Students Above National Median
by Year, Test Content, and Grade

              1985-86              1986-87              1987-88
          Number   Number      Number   Number      Number   Number
            of       of          of       of          of       of
Grade     States  Students     States  Students     States  Students

Reading

  1          7    281,734         6    271,954         7    302,544
  2          8    343,490         7    329,928         7    330,255
  3         12    362,239        12    302,893        10    461,152
  4         14    460,480        13    452,447        13    485,084
  5          8    242,871         7    209,289         8    226,122
  6         10    288,671        10    231,702        11    474,498
  7         10    381,570         8    283,334         9    337,862
  8         13    445,687        16    433,801        13    505,762
  9         10    250,712         7    244,762         8    351,102
 10          8    271,706        10    296,866         8    258,866
 11         10    250,732        11    239,223        11    241,956
 12          3     65,809         3     67,782         2     68,841

Mathematics

  1          7    281,734         6    271,954         7    302,544
  2          8    343,490         7    329,928         7    330,255
  3         11    353,612        11    293,452         9    339,089
  4         14    460,480        13    452,447        13    485,084
  5          8    242,871         7    209,289         8    226,122
  6          9    280,053         9    222,886        10    364,093
  7         10    381,570         8    283,334         9    337,862
  8         13    445,687        15    424,959        12    396,574
  9          7    300,728         7    244,762         8    351,102
 10          8    271,706         9    287,457         8    258,866
 11         10    250,712        11    239,223        11    241,956
 12          3     65,809         3     67,782         2     68,841



Table 4

Number of States and Number of Students Contributing to
Estimates of Percentile Rank of State Means or Medians
by Year, Test Content, and Grade

              1985-86              1986-87              1987-88
          Number   Number      Number   Number      Number   Number
            of       of          of       of          of       of
Grade     States  Students     States  Students     States  Students

Reading

  1          5    250,628         5    264,972         6    295,840
  2          6    308,342         6    323,318         7    385,391
  3         11    623,579        12    336,372        12    394,641
  4         11    389,954        12    446,642        13    509,839
  5          7    206,325         8    250,586        11    336,191
  6          8    526,312         8    245,215        11    391,526
  7          8    317,994         8    281,849        11    401,015
  8         11    403,406        16    471,619        14    468,180
  9          6    295,903         6    239,606         8    348,617
 10          6    236,868         9    291,311         8    253,699
 11          9    246,555        10    234,746        10    237,583
 12          3    276,030         2     65,120         2     68,841

Mathematics

  1          5    250,628         5    264,972         6    295,840
  2          6    308,342         6    323,318         7    385,391
  3         11    623,579        12    336,372        12    394,641
  4         11    389,954        12    446,642        13    509,839
  5          7    236,325         8    250,586        11    336,191
  6          8    526,312         8    215,215        11    391,526
  7          8    317,994         7    244,332        11    401,015
  8         11    403,406        16    471,619        14    468,180
  9          6    295,903         6    239,606         8    348,617
 10          6    236,868         8    253,671         8    258,722
 11          9    246,555        10    234,746        10    237,583
 12          3    276,030         2     65,120         2     68,841



Table 5

Changes in District Percents of Students Above the
National Median With Increasing Age of Norms

                                 Mean Change in       Estimated Mean
                     Mean Age    Percent Above        Change (Old Minus
         Number      of          Median Per Year      Current Norms)
         of          Norms
Grade    Districts   (Years)*    Reading    Math      Reading    Math

  1         46         4.7         1.3      1.7           6        8
  2         63         4.8         1.0      1.9           5        9
  3         73         5.1         1.2      1.7           6        9
  4         70         4.3         1.3      1.4           6        6
  5         73         5.2         1.4      1.9           7       10
  6         69         4.5         1.0      2.3           5       10
  7         61         4.8         0.5      2.2           2       11
  8         70         5.1         1.7      2.2           9       11
  9         49         4.7         0.5      2.3           2       11
 10         42         4.7        -0.3      1.1          -1        5
 11         42         5.0         1.1      2.3           6       12
 12         14         5.4         0.2      1.2           1        6

* Mean age of norms is the average number of years between the
date of test administration and the date that the norms used to
report district results were collected by the publisher.
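
The estimated change columns of Table 5 appear to be consistent with multiplying the mean age of the norms by the mean change per year. The short Python sketch below checks that reading of the table for three rows; the function name is hypothetical, and the multiplicative relationship is an assumption inferred from the column headings rather than a formula stated in the report.

```python
def estimated_change(mean_age_of_norms_years, mean_change_per_year):
    """Assumed relationship: total shift = age of norms x yearly change."""
    return mean_age_of_norms_years * mean_change_per_year


# (grade, mean age of norms, reading change per year, math change per year),
# copied from the Table 5 rows for grades 1, 7, and 10.
rows = [
    (1, 4.7, 1.3, 1.7),
    (7, 4.8, 0.5, 2.2),
    (10, 4.7, -0.3, 1.1),
]

for grade, age, reading_rate, math_rate in rows:
    reading = estimated_change(age, reading_rate)
    math_change = estimated_change(age, math_rate)
    print(f"Grade {grade}: reading {reading:.1f}, math {math_change:.1f}")
```

Rounded to whole percentage points, these products agree with the tabled estimates for the same grades (6 and 8, 2 and 11, -1 and 5), which supports the assumed relationship without proving it.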






Table 6

Estimated Changes in National Percentile Rank of
Achievement Scores at the National Median at
One Point in Time

1. Reading Achievement

                        Source/Years Being Compared

         CAT(a)  CTBS(b)  ITBS(c)  MAT(d)  SRA(e)  Stanford(f)  NAEP(g)
         77      81       77-8     77-8    78      81-2         74-5
         to      to       to       to      to      to           to
Grade    84-5    87       84-5     84-5    83-4    86           83-4

  1       28       7        9       20      -3      11
  2       14      10       12        5       1       4
  3       12       2       11       13       1       6            3
  4       11       8       12        5      -1       2
  5       14       5       11        7       2       2
  6       11       8       12        6      -3       2
  7       16       6       11        9      -2       2            0
  8       11       5       10        7      -4       1
  9       15                         9       2       3
 10        8                         5       2       0
 11        4                        -3      -2       4            2
 12        1                        -5      -7       3

2. Mathematics Achievement

         CAT(a)  CTBS(b)  ITBS(c)  MAT(d)  SRA(e)  Stanford(f)  NAEP(g)
         77      81       77-8     77-8    78      81-2         77-8
         to      to       to       to      to      to           to
Grade    84-5    87       84-5     84-5    83-4    86           85-6

  1       16      18        3       12      10      15
  2       14      22        5        9       3      10
  3       13      13        5       15      -6       9            4
  4       11      14        9        7      -2       8
  5       13      17        8       11       3       8
  6       13      17        8       10       0       7
  7       15      15       10        2       1       6            5
  8       18      11       10        5       0       7
  9       14                         0       1       4
 10        8                         4       4       4
 11        5                         7      -2       4            0
 12        2                         6      -4       5

Footnotes for Table 6

(a) Differences in California Achievement Tests (CAT), Form E (1984-
85 norms) percentile ranks and corresponding CAT, Form C
(1977 norms) percentile ranks of 50 (CTB/McGraw-Hill, 1987,
Table 38, p. 3-35).

(b) Differences in Comprehensive Tests of Basic Skills (CTBS), Form
U percentile ranks in 1981 and those required to have a
percentile rank of 50 on the CTBS in 1987 (based on November,
1988, CTB/McGraw-Hill press release, "CTB/McGraw-Hill Studies
Show Students Achieving at Higher Levels in Basic Skills"; see
also Rothman, 1988, p. 20). The 1987 norms are estimated
from weighted user data.

(c) Differences in Iowa Tests of Basic Skills (ITBS), Form G (1984-
85 norms) percentile ranks and corresponding ITBS, Form 7
(1977-78 norms) percentile ranks of 50 (Hieronymus & Hoover,
1986, Table 6.31, p. 153).

(d) Differences in Metropolitan Achievement Tests (MAT6), Survey
Forms L and M (1984-85 norms) and corresponding MAT, Forms J
and K (1977-78 norms) percentile ranks of 50 (Psychological
Corporation, 1988; Prescott, Balow, Hogan, & Farr, 1978;
1986).

(e) Differences in SRA Achievement Series, Forms 1 and 2 (1983-84
norms) percentile ranks and corresponding SRA Achievement
Series Forms 1 and 2 (1978 norms) percentile ranks of 50
(Science Research Associates, 1979; 1986).

(f) Differences in Stanford 7 Plus (1986 norms) percentile ranks and
corresponding Stanford Early School Achievement Test, 2nd
edition; Stanford Achievement Test, 7th edition; and Stanford
Test of Academic Skills (TASK), 2nd edition (1981-82 norms)
percentile ranks of 50 (Gardner, Madden, Rudner, Karlsen,
Merwin, Callis, & Collins, 1983; 1987).

(g) Differences for the National Assessment of Educational Progress
(NAEP) are based on age (9, 13, and 17) rather than grade (3,
7, and 11) cohorts. For reading, the differences are between
the 1983-84 assessment percentile ranks and the corresponding
1974-75 assessment percentile rank of 50 (NAEP, 1985). For
math, the differences are between the 1985-86 assessment
percentile ranks and the corresponding 1977-78 percentile rank
of 50 (NAEP, 1988; frequency distributions provided by
Beaton).






Table 7

Estimated Standardized Average Changes in Achievement Test Scores
on the Stanford from 1982 to 1986 (Based on Wiser & Lenke, 1987)

                   Reading                      Mathematics

            Total       1986              Total       1986
Grade       Group(a)    Non-users(b)      Group       Non-users

  1          .17          .10              .34         (c)
  2          .13          .04              .18         .10
  3          .13          .12              .15         .12
  4          .03         -.01              .12         .12
  5          .03         -.02              .17         .16
  6                      -.02              .10         .06
  7          .03(d)       .03(d)           .08         .06
  8          .00(d)      -.08(d)           .10         .11
  9          .08(d)       .03(d)           .05         .07
 10          .05          .05              .04         .03
 11          .10          .11              .03         .05
 12          .13          .14              .05         .08

(a) The mean for the full 1986 norming sample (users and non-users)
minus the 1982 mean, all divided by the 1982 standard
deviation.

(b) The mean for the 1986 non-users only minus the 1982 mean, all
divided by the 1982 standard deviation.

(c) Not available.

(d) Reading Comprehension.
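
The first two footnotes to Table 7 describe a standardized mean difference: a 1986 mean minus the 1982 mean, divided by the 1982 standard deviation. A minimal Python sketch of that arithmetic follows; the function name and the scale-score values are hypothetical, chosen only to show the calculation, and are not taken from Wiser and Lenke (1987).

```python
def standardized_change(mean_1986, mean_1982, sd_1982):
    """(1986 mean - 1982 mean) / 1982 standard deviation."""
    if sd_1982 <= 0:
        raise ValueError("standard deviation must be positive")
    return (mean_1986 - mean_1982) / sd_1982


# Hypothetical scale scores used only to illustrate the arithmetic.
change = standardized_change(mean_1986=606.8, mean_1982=600.0, sd_1982=40.0)
print(round(change, 2))
```

Because the difference is divided by the earlier year's standard deviation, the tabled values can be read as fractions of a 1982 standard deviation, which makes them comparable across grades and subjects.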



Figure 1

Percent of Students Scoring Above National Median
Based on States Reporting (Weighted by Number of Students)

[Bar charts by grade for Reading and Mathematics. Vertical axis: percent (0 to 100). Legend: 1985-86, 1986-87.]
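
The weighting named in the Figure 1 caption can be illustrated as a student-weighted average of state percents. The Python sketch below is one plausible reading of "weighted by number of students"; the function name and the three state entries are hypothetical, not values from the report.

```python
def weighted_percent_above_median(state_results):
    """state_results: list of (percent_above_median, number_of_students)."""
    total_students = sum(n for _, n in state_results)
    if total_students == 0:
        raise ValueError("no students reported")
    return sum(p * n for p, n in state_results) / total_students


# Hypothetical states: (percent above national median, students tested).
states = [(60.0, 100_000), (55.0, 50_000), (70.0, 50_000)]
print(weighted_percent_above_median(states))
```

Under this weighting a large state moves the overall figure more than a small one, which is why the result here sits closer to 60 than the unweighted average of the three percents would.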



Figure 2

Percent of Students Reported by States to be Scoring
Above the National Median in Reading (Each Point
Represents a State)

[Two dot plots, Grades 1-6 and Grades 7-12. Horizontal axis: grade. Vertical axis: percent (0 to 100).]



Figure 3

Percent of Students Reported by States to be Scoring
Above the National Median in Mathematics (Each Point
Represents a State)

[Two dot plots, Grades 1-6 and Grades 7-12. Horizontal axis: grade. Vertical axis: percent (0 to 100).]



Figure 4

Weighted Mean of State Percentile Ranks

[Bar charts by grade for Reading and Mathematics. Vertical axis: percentile (0 to 100). Legend: 1985-86, 1986-87, 1987-88.]



Figure 5

State Median Percentile Rank or Percentile Rank
of State Mean Test Score in Reading (Each Point
Represents a State)

[Two dot plots, Grades 1-6 and Grades 7-12. Horizontal axis: grade. Vertical axis: percentile (0 to 100).]



Figure 6

State Median Percentile Rank or Percentile Rank
of State Mean Test Score in Mathematics (Each Point
Represents a State)

[Two dot plots, Grades 1-6 and Grades 7-12. Horizontal axis: grade. Vertical axis: percentile (0 to 100).]



Figure 7

Estimated Percent of Students Scoring Above National Median
Based on District Results Weighted by Region, District
Size, and SES

[Bar chart by grade. Legend: Reading, Mathematics.]



Figure 8

Means of District Percentile Ranks Weighted by Region,
District Size, and SES

[Bar chart by grade. Vertical axis: percentile. Legend: Reading, Mathematics.]






Figure 9

Stem-and-Leaf Distribution of the District Percents of Students
Scoring Above the National Median at Grade 3

           Reading                             Mathematics

Stem   Leaf             Count        Stem   Leaf             Count

 9 :                       0          9 :                       0
 9 :   34                  2          9 :   123                 3
 8 :   558                 3          8 :   7899                4
 8 :   12                  2          8 :   012224              6
 7 :   56799               5          7 :   83                  2
 7 :   0122344             7          7 :   000112244           9
 6 :   [illegible]                    6 :   556778888899       12
 6 :   00111224444444     14          6 :   000123344444       12
 5 :   5566677899         10          5 :   55567788999        11
 5 :   001233344           9          5 :   1222333444         10
 4 :   556889              6          4 :   55[?]667899         9
 4 :   001223              6          4 :   0224                4
 3 :   69                  2          3 :   69                  2
 3 :   012223344           9          3 :   334                 3
 2 :   89                  2          2 :                       0
 2 :   14                  2          2 :   0                   1
 1 :   5                   1          1 :                       0
 1 :                       0          1 :   1                   1

         Reading    Mathematics
P90        78           82
P75        67           70
P50        58           61
P25        45           52
P10        32           42
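
A stem-and-leaf display such as Figure 9 can be built by splitting each percent into a tens digit (the stem) and a units digit (the leaf), with each stem divided into an upper row for leaves 5 to 9 and a lower row for leaves 0 to 4. The Python sketch below is one way to do this; the function name and output layout are hypothetical, and the seven input values are the ones implied by the first three non-empty Reading rows of the figure.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return display lines with split stems (leaves 5-9 above leaves 0-4)."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        half = 1 if leaf >= 5 else 0  # 1 = upper half-stem, 0 = lower half-stem
        rows[(stem, half)].append(leaf)
    lines = []
    # Highest stems first; within a stem the upper half comes first.
    for stem, half in sorted(rows, reverse=True):
        leaves = "".join(str(d) for d in rows[(stem, half)])
        lines.append(f"{stem} : {leaves}  ({len(leaves)})")
    return lines


# District percents read off the top Reading rows of Figure 9.
for line in stem_and_leaf([93, 94, 85, 85, 88, 81, 82]):
    print(line)
```

This sketch omits empty half-stems, whereas the figure prints them with a count of 0; adding them would only require iterating over every stem in the observed range.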



Figure 10

Box-and-Whisker Plots Showing the Percent of Students Reported
to be Above the National Median in Reading by Grade for Districts
at the 10th, 25th, 50th, 75th, and 90th Percentiles of the
District Distributions

[Box-and-whisker plots by grade (1 to 12) with the P10, P25, P50, P75, and P90 points marked. Vertical axis: percent of students above the national median.]



Figure 11

Box-and-Whisker Plots Showing the Percent of Students Reported
to be Above the National Median in Mathematics by Grade for
Districts at the 10th, 25th, 50th, 75th, and 90th Percentiles
of the District Distributions

[Box-and-whisker plots by grade (1 to 12) with the P10, P25, P50, P75, and P90 points marked. Vertical axis: percent of students above the national median.]



Figure 12

Trends in Reading and Mathematics Achievement for Four States

[Four line-chart panels, each plotting Reading and Math by year: "Percent Above Median by Year" (82-83 to 86-87); "State C: Percent Passing by Year" (79-80 to 87-88); "Percentile Rank of State Mean by Year" (81-82 to 85-86); "Percentile Rank of State Mean by Year" (76-77 to 85-86).]



Figure 13

Percent of Students Above National Median for District A
Before and After a Change of Test Editions
(New Edition in 1987-88)

[Bar chart by grade; the Mathematics panel label is legible. Legend: Old 85-6, Old 86-7, New 87-8.]



Figure 14

Percent of Students Above National Median for District B
Before and After a Change of Test Editions
(New Edition in 1987-88)

[Bar charts by grade for Reading and Mathematics. Legend: Old 85-6, Old 86-7, New 87-8.]



Figure 15

Estimated Yearly Changes in Reading Percentile Rank:
Publisher Results at the Median

[Bar chart by grade. Legend: CAT, CTBS, ITBS, MAT, SRA, SAT.]



Figure 16

Estimated Yearly Changes in Mathematics Percentile Rank:
Publisher Results at the Median

[Bar chart by grade. Legend: CAT, CTBS, ITBS, MAT, SRA, SAT.]



Figure 17

Estimated Change at the Median in National Percentile Ranks of
Achievement Test Scores at Grade 3 (NAEP, Age 9)

[Line charts for Reading and Mathematics. Horizontal axis: year (74 to 87). Vertical axis: change in percentile rank (-6 to 16). Legible series labels: MAT, CTBS, CAT, STAN, NAEP, SRA.]



Figure 18

Estimated Change at the Median in National Percentile Ranks of
Achievement Test Scores at Grade 7 (NAEP, Age 13)

[Line charts for Reading and Mathematics. Horizontal axis: year (74 to 87). Vertical axis: change in percentile rank (-6 to 16). Legible series labels: CAT, CTBS, ITBS, STAN, NAEP, MAT.]



Figure 19

Estimated Change at the Median in National Percentile Ranks of
Achievement Test Scores at Grade 11 (NAEP, Age 17)

[Line charts for Reading and Mathematics. Horizontal axis: year (74 to 87). Vertical axis: change in percentile rank. Legible series labels: MAT, CAT, NAEP, SRA.]



Appendix A 

Sample Letter and Data Collection Form For 
Directors of State Testing Programs 






July 22, 1988 







[Addressee name and address]

Dear [Addressee]:

We seek your assistance in a study that is being conducted by the Center for
Research on Evaluation, Standards, and Student Testing (CRESST) on behalf of
the Office of Educational Research and Improvement (OERI). This study was
stimulated by the report "Nationally Normed Elementary Achievement Testing in
America's Public Schools: How All Fifty States Are Above Average" by Dr. John
J. Cannell. As you know, this report attracted considerable attention in the
press and has been of great interest at OERI and among those concerned about
the assessment of educational achievement.

Cannell's findings and conclusions are both provocative and controversial.
The interpretation of normative comparisons was called into question by
Cannell's finding that "no state scores below the publisher's 'national norm'
at the elementary level on any of the six major nationally normed,
commercially available tests" (p. 2 of second edition of Cannell Report). The
value of assessment results was further challenged by Cannell's conclusion
that "standardized, nationally normed achievement tests give children,
parents, school systems, legislatures, and the press misleading reports on
achievement levels" (p. 6 of special issue of Educational Measurement: Issues
and Practice, 1988, Vol. 7, No. 2).

Given the importance that is attached to student achievement and the
widespread use of normative comparisons, Cannell's findings and conclusions
deserve close scrutiny. We need to have a better understanding of the
magnitude and prevalence of the apparently high achievement results reported
by Cannell. We also need to have a better understanding of the factors which
may contribute to and explain the findings.

To achieve these goals, we need your help in collecting information that will
provide a better data base for determining not only what proportion of
students score above the 50th percentile according to national norms, but also
other important characteristics of the test results, such as changes in means
over time and the variability in scores. We also need to obtain information on
the way in which test results are currently used (e.g., public reporting, grade
retention, school incentives, etc.), when these uses were instituted, and
planned changes in the use of test results. Finally, we are seeking information
about policies regarding test security and guidelines on preparation of
students for taking tests.

A CRESST staff member will be contacting you by phone to seek your assistance
and to arrange for a time for a phone interview with an appropriate person on
your staff. The information that will be requested is outlined on the
enclosure. We will send you more detailed worksheets between now and the time
of the telephone interview to help organize the requested information.

In many cases, the information that we are seeking may be provided in reports
that have previously been prepared. Thus we request that you send us copies
of any reports that give summaries of district results that have been
published within the past three years. Copies of press releases and newspaper
articles about the test results would also be useful. If you send us reports
and press releases as quickly as possible, we will use the reports to extract
as much of the requested information as possible. We will call you to ask
questions after we have "done our homework."

Please send reports to:    Robert L. Linn
                           School of Education
                           Campus Box 249
                           University of Colorado
                           Boulder, CO 80309-0249

Thank you for your consideration. We will phone you within the next two weeks
to answer questions and to try to arrange a time for a telephone interview. A
return postcard is enclosed so that you can indicate the name, phone number,
and best times for us to try to contact the appropriate person for the
telephone interview.

Sincerely, 



Eva L. Baker                            Robert L. Linn
UCLA                                    University of Colorado-Boulder

Co-Directors, Center for the Study of Research on Evaluation,
Standards, and Student Testing


Explanation of Information Requested

Column   Information requested

1    Testing year.
2    Grade levels tested K - 12.
3    Name of test used for statewide assessment, e.g., CTBS, MAT, name of
     locally developed test.
4    Edition of the test used at each grade level, e.g., 1982.
5    Form of the test used at each grade level.
6    Year when test was first used.
7    Norming year of test used for reporting scores.
8    Month in which tests were administered.
9    Type of scores reported, e.g., percent correct, percentile rank, NCE.

     n.b. If you have more than one type of score, please provide one form
     of data in the preferred order as follows:

          Percentile Rank
          Grade Equivalents
          NCE
          Stanines
          Percent Correct

10   Number of students enrolled: the total number of students by grade
     statewide.
11   Number of students tested.
12   Number of students' scores reported: If not all scores are used to
     compute rankings or other statewide test results, enter the number of
     students' scores used to compute the achievement data.
13   Reading %: The percent of students scoring above the national 50th
     percentile statewide.
14   Math %: The percent of students scoring above the national 50th
     percentile statewide.

n.b. If neither reading nor math data requested in 13 and 14 are available, please
provide the most appropriate composite scores and indicate the nature of these
on the form.



If the data requested in columns 13 or 14 (percent of students scoring above the
national 50th percentile) are not available, please provide as much of the following
as possible (columns 15 - 20 on the Alternate Information Sheet):

Column

15   Reading statewide mean.
16   Reading statewide standard deviation.
17   Math statewide mean.
18   Math statewide standard deviation.
19   Reading score at each percentile: The score at the 25th
     percentile statewide.
     - at the 50th percentile statewide.
     - at the 75th percentile statewide.
20   Math score at each percentile: The math score at the 25th
     percentile statewide.
     - at the 50th percentile statewide.
     - at the 75th percentile statewide.

Type of scores: If the type of scores reported in columns 13-20 are not
the same as those indicated in column 9, please indicate the type of
scores used to compute the percentiles, mean, and standard deviations.



Statewide Testing Information

State Name                Person Supplying Information                Title

   1          2        3          4         5       6            7          8          9
Testing                Test                         Year First   Norming    Testing    Type of
Year        Grade      Name       Edition   Form    Used         Year       Dates      Scores

[Blank form grid: for each grade from K through 12, one row for each testing year, 1985-1986, 1986-1987, and 1987-1988.]

Please Refer to Explanation of Information Requested - Attached

   10                   11                   12                     13                       14
Number of Students   Number of Students   Number of Students'   Reading: % of Students   Math: % of Students
Enrolled             Tested               Scores Reported

[Blank form grid continued: the same grade and testing-year rows as above.]

Alternate Information Available

State Name                Person Supplying Information

   15        16           17        18            19                    20
                          Math                    at each percentile    at each percentile
Mean      Standard        Mean      Standard      25    50    75        25    50    75
          Deviation                 Deviation

[Blank form grid: for each grade from K through 12, one row for each testing year, 1985-1986, 1986-1987, and 1987-1988.]



Appendix B 
Interview Guide 






District                                Code
State

Interviewer                             Date

Person(s) Interviewed

name                                    name

title                                   title

Background Information: Number of schools in district
                        Size (range)

Center for the Study of Research on Evaluation, Standards, and Student Testing,
Robert L. Linn, School of Education, University of Colorado at Boulder



B-2 

Part I: District Testing Data (to be recorded on the fornrs provided) 

YEARS 1. Are districtvide test resul:3 available for: 
TESTED 

1987-33 

1986-37 

1985-86 If none, then the most recent year: 



If there is no districtwide testing, ask only 12, 13, 19 - 22, and 26 for large
districts.

ENROLLMENT 2. What is the basis for the enrollment figures used to give the
BASIS number of students in each grade? (e.g., ADA, Average Daily

Attendance)



ENROLLMENT 3. What office provides the enrollment figures?
SOURCE [name of person and phone number if easily available] 



TESTED = 4. Is the number of students tested the same as the number

REPORTED of students that are included in the reported test results?



Yes No 



If no, how does the number included in the reported test results
differ from the number tested?

probe: special education 






B-3 



SAMPLING
PLAN



5. Were all eligible students in the grade tested or is a
sampling plan used?

universal testing by grade _ sampling plan

Please describe any sampling procedures used.



TESTING
EXCLUSIONS



6. What rules are used to determine students who are excluded
from testing?

request: copies of any written policies that describe these rules



EXCLUDED



7. How many students (or what percent of the students) are
excluded using these rules?



MAKE-UP 8. What are the policies for make-up testing (for students who
TESTING are absent)?

request: if in writing






B-4 



[Ask the following as needed:]

LOCALLY 9. If a specially constructed test is used, is it linked to a
CONSTRUCTED norm-referenced test? If so, what is the name and edition of the
TEST norm-referenced test?



NATIONAL
COMPARISONS



10. If the percent of students above the 50th percentile is
unknown, please describe the way in which scores are reported
and comparisons are made to the national norm.



LOCAL
FACTORS
IN TEST

SCORES



11. Are any factors of schools or the characteristics of their
students taken into account in reporting test scores?

(e.g., percent minority, percent eligible for free lunch, Chapter I)






B-5



[BEGIN TAPE RECORDING]

Part II: Testing Policies and Perceptions



USES AND 12. What are the uses of test results?

IMPORTANCE

-local district and school instructional and evaluation decisions;



-reporting to parents about individual student progress or school
programs



-School Board attention (and if so, how have Board members used test
results: to increase testing programs or other forms of
accountability?)



-state or local politician use of scores in campaigning or proposing
legislation



-changing general funding levels for schools



-targeted funds or mandating programs such as remediation



-superintendent, principal, or teacher performance rating or jobs



-media coverage and community awareness



How important are test scores in your district?

/ / / / /

extremely very moderately slightly not important






B-6 



REFORMS 13. Have major educational reforms been introduced in your

district in the past five years?

request: Would you briefly describe these or send us written
descriptions that are available?



TEST 14. Who selected the standardized test(s) being used? (If locally

SELECTION developed, how was the content selected?)

probe: committee composition, e.g., teachers, parents,...



CURRICULUM 15. Have there been efforts to assure that the curriculum and the
ALIGNMENT test are aligned?

If so, please describe those efforts. 






B-7 



TIME ON

SPECIFIC
OBJECTIVES



16. Do you think that teachers spend more time teaching the
specific objectives on the test(s) than they would if the tests
were not required?



How much more time?



IMPORTANT 17. To what extent do you think important objectives are given

OBJECTIVES less time or emphasis because they are not included on the

GIVEN LESS TIME test?

What kinds of objectives are neglected?



INFORMAL 18. Do you or members of your staff provide informal guidelines

GUIDELINES about test preparation? What kind of advice do you give 

ABOUT TEST schools about how to prepare students to take 

PREPARATION tests? 



probes: 

length of time to practice

minimum and maximum recommended time for practice
whether to use items in a specific format for practice 






B-8

TECHNICAL 19. What kind of technical assistance or materials do you

ASSISTANCE provide to schools about test preparation?

ABOUT TEST

request: Would you send us copies of the materials or descriptions
of the assistance?

probes:

practice tests
testwiseness packages

curriculum domain materials but not specific test items
amount of these activities



TYPICAL 20. Can you describe typical practices of test preparation? 

PRACTICES OF 
TEST PREPARATION 

probes: 

If they say one school does X, ask how common this is, or how many
other schools do the same.

Do schools use the materials and assistance you provide?
What else do they do beyond what you recommend?



EXTREME 21. Can you describe extreme cases of test preparation?

PRACTICES OF
TEST PREPARATION

probes:

If they describe a worst case, ask what they would think of as a best
case (as well as what is more typical, above).
Examples of cases which violate your recommendations?



TEST ADMINISTRATION 22. Do you have written policies regarding test 
AND SECURITY administration and security procedures?

POLICIES If not, do you have informal guidelines?

request : written policies 






B 



WHO 23. Who administers the tests?

ADMINISTERS Do teachers in some schools have copies of the

OR HAS TESTS tests prior to test administration?

OR KNOWS TESTS

How familiar are teachers with the specific items on the
tests?

probes:

teachers administering same test over years
principals or teachers having test files



DETECT 24. Do you have any formal procedures for detecting anomalies in

ANOMALIES the data?

request: copies

probes:

check for missing test booklets

computer detection of significant numbers of erasures

computer detection of extraordinary gains from one year to the

next

check numbers of students tested against enrollment 




B-11



TYPICAL AND 25. Can you give examples of both typical and extreme testing
EXTREME practices?
PRACTICES

Have you withheld score reports because of suspected cheating?
probes:

good practices: consistent, successful make-up testing

examples of cheating:

teachers filling in answers
extending time limits for tests
teaching specific items on the test
discrepancies in numbers of students tested



[Ask the following only in districts designated as 7's or 8's: large districts]



REACTIONS
TO CANNELL
REPORT



26. What are your reactions to the Cannell report and its
conclusions?






FACTORS IN 27. What do you think are the primary factors that contribute to
ACHIEVEMENT the recent trends in achievement test scores in your

TRENDS district?

probes:

educational reforms

norms (unrepresentative or old)

pressure on teachers to have high scores






B-13 



Closing: 

When finishing and thanking them for their time, review the things which you may
have requested in writing.



Checklist of Requested Written Information 

testing data on years not yet received (e.g., all three years 1985-1988)

testing data such as distribution measures

#3- Name and phone of office or person with enrollment figures

#6- Rules for testing exclusions

#8- Policies for make-up testing

#13- Educational reforms in the state

#19- Technical assistance or materials for test preparation

#22- Test administration and security policies 

#24- Procedures for detecting anomalies 



The address for mailing is: 

Dr. Robert Linn              303-492-8230 (Bob)
University of Colorado       or -2124 (Nancy)
School of Education          or - (Lorrie)
Campus Box 249
Boulder, CO 80309



If you have missing answers and have to schedule another call, please indicate 
that in the telephone log. 






Appendix C 

Districts Available by Cells of Sampling Design 



C-1

Appendix C

Number of Districts Available by Cells in Sampling Design 



                                                     Number of
                                                     Districts
Region          District Size       SES Level        Available

East            Less than 1,200     Low                  5
                                    Below Average        5
                                    Average              5
                                    Above Average        5
                                    High                 5
                1,200 to 2,499      Low                  5
                                    Below Average        5
                                    Average              5
                                    Above Average        5
                                    High                 5
                2,500 to 4,999      Low                  5
                                    Below Average        5
                                    Average              5
                                    Above Average        5
                                    High                 5
                5,000 to 9,999      Low                  5
                                    Below Average        5
                                    Average              5
                                    Above Average        5
                                    High                 5
                10,000 to 24,999    Low                  5
                                    Below Average        5
                                    Average              5
                                    Above Average        5
                                    High                 5
                25,000 to 49,999    Low                  2
                                    Below Average        4
                                    Average              0
                                    Above Average        1
                                    High                 1
                50,000 to 99,999    Low                  1
                                    Below Average        2
                                    Average              1
                                    Above Average        1
                                    High                 0
                100,000 or more     Low                  1
                                    Below Average        2
                                    Average              0
                                    Above Average        2
                                    High                 1






Appendix C (page 2 of 4) 



                                                     Number of
                                                     Districts
Region          District Size       SES Level        Available

North/Central   Less than 1,200     Low                  5
                                    Below Average        5
                                    Average              5
                                    Above Average        5
                                    High                 5
                1,200 to 2,499      Low                  5
                                    Below Average        5
                                    Average              5
                                    Above Average        5
                                    High                 5
                2,500 to 4,999      Low                  5
                                    Below Average        5
                                    Average              5
                                    Above Average        5
                                    High                 5
                5,000 to 9,999      Low                  5
                                    Below Average        5
                                    Average              5
                                    Above Average        5
                                    High                 5
                10,000 to 24,999    Low                  1
                                    Below Average        5
                                    Average              5
                                    Above Average        5
                                    High                 5
                25,000 to 49,999    Low                  0
                                    Below Average        2
                                    Average              5
                                    Above Average        5
                                    High                 4
                50,000 to 99,999    Low                  1
                                    Below Average        3
                                    Average              2
                                    Above Average        0
                                    High                 0
                100,000 or more     Low                  0
                                    Below Average        1
                                    Average              1
                                    Above Average        0
                                    High                 0



Appendix C (page 3 of 4)



C-3 



                                                     Number of
                                                     Districts
Region          District Size       SES Level        Available

South           Less than 1,200     Low                  5
                                    Below Average        5
                                    Average              5
                                    Above Average        2
                                    High                 3
                1,200 to 2,499      Low                  5
                                    Below Average        5
                                    Average              5
                                    Above Average        2
                                    High                 0
                2,500 to 4,999      Low                  5
                                    Below Average        5
                                    Average              5
                                    Above Average        5
                                    High                 5
                5,000 to 9,999      Low                  5
                                    Below Average        5
                                    Average              5
                                    Above Average        5
                                    High                 3
                10,000 to 24,999    Low                  5
                                    Below Average        5
                                    Average              5
                                    Above Average        5
                                    High                 4
                25,000 to 49,999    Low                  2
                                    Below Average        5
                                    Average              5
                                    Above Average        5
                                    High                 2
                50,000 to 99,999    Low                  1
                                    Below Average        3
                                    Average              5
                                    Above Average        5
                                    High                 1
                100,000 or more     Low                  0
                                    Below Average        1
                                    Average              5
                                    Above Average        0
                                    High                 1



C-4



Appendix C (page 4 of 4) 



                                                     Number of
                                                     Districts
Region          District Size       SES Level        Available

West            Less than 1,200     Low                  5
                                    Below Average        5
                                    Average              5
                                    Above Average        5
                                    High                 5
                1,200 to 2,499      Low                  5
                                    Below Average        5
                                    Average              5
                                    Above Average        5
                                    High                 5
                2,500 to 4,999      Low                  5
                                    Below Average        5
                                    Average              5
                                    Above Average        5
                                    High                 5
                5,000 to 9,999      Low                  5
                                    Below Average        5
                                    Average              5
                                    Above Average        5
                                    High                 5
                10,000 to 24,999    Low                  5
                                    Below Average        5
                                    Average              5
                                    Above Average        5
                                    High                 5
                25,000 to 49,999    Low                  2
                                    Below Average        2
                                    Average              5
                                    Above Average        5
                                    High                 5
                50,000 to 99,999    Low                  1
                                    Below Average        1
                                    Average              5
                                    Above Average        5
                                    High                 1
                100,000 or more     Low                  0
                                    Below Average        0
                                    Average              3
                                    Above Average        1
                                    High                 0






Appendix D 

Sample Letters, Data Collection Forms, and Questionnaires 

Sent to Districts 



D-1



August 18, 1988









[Address block: mail-merge fields, printed as "NOT ON DESKTOP"]

Dear [NOT ON DESKTOP]:

We seek your assistance in a study that is being conducted by the Center for
Research on Evaluation, Standards, and Student Testing (CRESST) on behalf of
the U.S. Department of Education's Office of Educational Research and
Improvement (OERI). This study was stimulated by the report "Nationally
Normed Elementary Achievement Testing in America's Public Schools: How All
Fifty States Are Above Average" by Dr. John J. Cannell. As you may know, this
report attracted considerable attention in the press and has been of great
interest at OERI and among those concerned about the assessment of educational
achievement.

Cannell's findings and conclusions are both provocative and controversial.
Based on his survey of states and selected school districts, Cannell
concluded that "standardized, nationally normed achievement tests give
children, parents, school systems, legislatures, and the press misleading
reports on achievement levels" (p. 6 of special issue of Educational
Measurement: Issues and Practice, 1988, Vol. 7, No. 2).

Given the importance that is attached to student achievement and the
widespread use of normative comparisons, Cannell's findings and conclusions
deserve close scrutiny. We need to have technically accurate information
about achievement results reported by school districts across the nation. We
also need to have a better understanding of the factors which may contribute
to and explain the findings.

To achieve these goals, we need your help in collecting information from a
nationally representative sample of school districts that will provide a
better data base for determining not only what level of student performance is
being reported, but the uses and interpretations that are being made of the
results. We also are seeking information about factors that may influence
test results.






D-2 



Your district has been selected as part of a nationally representative sample
for this study. Hence, your participation is critical to maintaining
representativeness and drawing conclusions about achievement testing for the
nation. Results will not be reported for individual school districts.
However, participation by each sampled district is essential to ensuring an
accurate picture for the nation as a whole.

We ask that you complete the enclosed questionnaire about your district's
testing program. In many cases, the information that we are seeking on the
forms may be provided in reports that have previously been prepared. If so,
we request that you answer the general questionnaire items and send us the
questionnaire along with copies of any reports that give results of
districtwide assessments of student achievement or summaries of district
results that have been published within the past three years. We will use
those reports to obtain the requested information. Copies of press releases
and newspaper articles about the test results would also be useful.

Please return the completed questionnaire in the enclosed envelope to: 

Robert L. Linn 
School of Education 
Campus Box 249 
University of Colorado 
Boulder, CO 80309-0249 

We also ask you to participate in a telephone interview which concerns 
additional questions about testing policies and practices. In order to 
schedule an interview, we ask that you indicate on the questionnaire dates and 
times which would be convenient for one of our staff members to call. The 
interviews consist of fifteen questions about your testing program and usually 
last about 30 minutes.

Thank you for your consideration. We realize that school districts receive 
many requests for information and that responding to such requests is a burden 
on your time. Your willingness to help is essential to the success of the 
study and to our ability to provide solid answers to the important educational 
questions that were raised by the Cannell report. 

Sincerely,



Eva L. Baker Robert L. Linn 

UCLA University of Colorado-Boulder 

Co-Directors, Center for Research on Evaluation, Standards, and
Student Testing 






August 18, 1988 



[Address block: mail-merge fields, printed as "NOT ON DESKTOP"]

Dear [NOT ON DESKTOP]:

We seek your assistance in a study that is being conducted by the Center for
Research on Evaluation, Standards, and Student Testing (CRESST) on behalf of
the U.S. Department of Education's Office of Educational Research and
Improvement (OERI). This study was stimulated by the report "Nationally
Normed Elementary Achievement Testing in America's Public Schools: How All
Fifty States Are Above Average" by Dr. John J. Cannell. As you may know, this
report attracted considerable attention in the press and has been of great
interest at OERI and among those concerned about the assessment of educational
achievement.

Cannell's findings and conclusions are both provocative and controversial.
Based on his survey of states and selected school districts, Cannell
concluded that "standardized, nationally normed achievement tests give
children, parents, school systems, legislatures, and the press misleading
reports on achievement levels" (p. 6 of special issue of Educational
Measurement: Issues and Practice, 1988, Vol. 7, No. 2).

Given the importance that is attached to student achievement and the
widespread use of normative comparisons, Cannell's findings and conclusions
deserve close scrutiny. We need to have technically accurate information
about achievement results reported by school districts across the nation. We
also need to have a better understanding of the factors which may contribute
to and explain the findings.

To achieve these goals, we need your help in collecting information from a
nationally representative sample of school districts that will provide a
better data base for determining not only what level of student performance is
being reported, but the uses and interpretations that are being made of the
results. We also are seeking information about factors that may influence
test results.

Your district has been selected as part of a nationally representative sample
for this study. Hence, your participation is critical to maintaining
representativeness and drawing conclusions about achievement testing for the
nation. Results will not be reported for individual school districts.
However, participation by each sampled district is essential to ensuring an
accurate picture for the nation as a whole.

D-4

We ask that you complete the enclosed questionnaire about your district's
testing program. In many cases, the information that we are seeking on the
forms may be provided in reports that have previously been prepared. If so,
we request that you answer the general questionnaire items and send us the
questionnaire along with copies of any reports that give results of
districtwide assessments of student achievement or summaries of district
results that have been published within the past three years. We will use
those reports to obtain the requested information. Copies of press releases
and newspaper articles about the test results would also be useful.

Please return the completed questionnaire in the enclosed envelope to: 

Robert L. Linn
School of Education
Campus Box 249
University of Colorado 
Boulder, CO 80309-0249 

Thank you for your consideration. We realize that school districts receive 
many requests for information and that responding to such requests is a burden 
on your time. Your willingness to help is essential to the success of the 
study and to our ability to provide solid answers to the important educational 
questions that were raised by the Cannell report. 



Sincerely,



Eva L. Baker Robert L. Linn

UCLA University of Colorado-Boulder

Co-Directors, Center for Research on Evaluation, Standards, and 
Student Testing 






District Testing Information 



District Name 



Person Supplying information 



State



Address 



Phone Number 



Title 



Testing Year    Grade    Test Name    Edition    Form    Year First Used    Norming Year    Testing Dates    Type of Scores

[Blank form rows: testing years 1985-1986, 1986-1987, and 1987-1988 for each of grades K through 12.]



















D-7 



Please Refer to Explanation of Information Requested - Attached



10                   11                   12                    13                       14

Number of Students   Number of Students   Number of Students'   Reading: % of Students   Math: % of Students
Enrolled             Tested               Scores Reported       above National 50%ile    above National 50%ile



D-8



2

8-11. Please indicate below the name of the test used at each grade level tested (for standardized tests,
include edition and form), the number of students tested, AND THE PERCENT OF STUDENTS ABOVE THE
NATIONAL 50TH PERCENTILE. (If the percent of students above the national 50th percentile is not available,
please provide as much of the information on pages 4 and 5 as possible.)







                         8                     9                    10                       11

Testing Year    Grade    Test Name, Edition    Number of Students   Reading: % of Students   Math: % of Students
                         and Form              Tested               above National 50%ile    above National 50%ile

[Blank form rows: testing years 1985-1986, 1986-1987, and 1987-1988 for each of grades K through 6.]



D-9 
3 



Testing Year    Grade    Test Name, Edition    Number of Students   Reading: % of Students   Math: % of Students
                         and Form              Tested               above National 50%ile    above National 50%ile

[Blank form rows: testing years 1985-1986, 1986-1987, and 1987-1988 for each of grades 7 through 12.]













12. Testing Dates (month/year) 

13. Norming year of norm referenced test(s) used:

14. Year these tests were first used in your district: 



D-10

4 

If the percent of students above the national 50th percentile is provided on pages 2 and 3, pages 4 and
5 need not be completed. Skip to page 6.

If the number of students above the national 50th percentile (columns 10 and
11, pages 2-3) is not known, please provide as much of the following
information as possible.



Testing Year    Grade    Reading                    Math                       Reading Score         Math Score
                         Mean, Standard Deviation   Mean, Standard Deviation   at each percentile    at each percentile
                                                                               25  50  75            25  50  75

[Blank form rows: testing years 1985-1986, 1986-1987, and 1987-1988 for each of grades K through 6.]



D-11

5 



Testing Year    Grade    Reading                    Math                       Reading Score         Math Score
                         Mean, Standard Deviation   Mean, Standard Deviation   at each percentile    at each percentile
                                                                               25  50  75            25  50  75

[Blank form rows: testing years 1985-1986, 1986-1987, and 1987-1988 for each of grades 7 through 12.]



Alternate Information Available
District Test Results 



D-12 



District Name Person Supplying Information 



State Address Phone Number 



Title 







                         15      16                 17      18                 19                    20

Testing Year    Grade    Reading                    Math                       Reading Score         Math Score
                         Mean, Standard Deviation   Mean, Standard Deviation   at each percentile    at each percentile
                                                                               25  50  75            25  50  75

[Blank form rows: testing years 1985-1986, 1986-1987, and 1987-1988 for each of grades K through 6.]



D-13

[Blank form rows, continued: testing years 1985-1986, 1986-1987, and 1987-1988 for each of grades 7 through 12.]



D-14 



Explanation of Information Requested 



Column Information requested

1 Testing year

2 Grade levels tested K - 12.

3 Name of test used, e.g., CTBS, MAT, name of locally developed test.

4 Edition of the test used at each grade level, e.g., 1982.

5 Form of the test used at each grade level.

6 Year when test was first used.

7 Norming year of test used for reporting scores.

8 Month in which tests were administered.

9 Type of scores reported, e.g., percent correct, percentile rank, NCE.



n.b. If you have more than one type of score, please provide one form
of data in the preferred order as follows:

Percentile Rank
Grade Equivalents
NCE
Stanines
Percent Correct

10 Number of students enrolled: the total number of students enrolled by
grade.

11 Number of students tested at each grade.

12 Number of students' scores reported: If not all scores are used to
compute rankings or other statewide test results, enter the number of
students' scores used to compute the achievement data.

13 Reading %: The percent of students scoring above the national 50th
percentile.

14 Math %: The percent of students scoring above the national 50th
percentile.

n.b. If neither reading nor math data requested in 13 and 14 are available, please
provide the most appropriate composite scores and indicate the nature of these
on the form.



D-15



If the data requested in columns 13 or 14 (percent of students scoring above the
national 50th percentile) are not available, please provide as much of the following
as possible (columns 15 - 20 on the Alternate Information Sheet):

Column

15 Reading mean for the district.

16 Reading standard deviation.

17 Math mean.

18 Math standard deviation.

19 Reading score at each percentile: The score

- at the 25th percentile districtwide.

- at the 50th percentile districtwide.

- at the 75th percentile districtwide.

20 Math score at each percentile: The math score

- at the 25th percentile districtwide.

- at the 50th percentile districtwide.

- at the 75th percentile districtwide.



Type of scores: If the type of scores reported in columns 13-20 are not
the same as those indicated in column 9, please indicate the type of
scores used to compute the percentiles, mean, and standard deviations.
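
When only a district mean and standard deviation are available (columns 15 through 18), the percent of students above the national 50th percentile can be approximated. The sketch below is illustrative only and is not a procedure stated in this report. It assumes that scores are roughly normally distributed within the district and that the national median is known on the same score scale (for NCE scores the national median is 50). The function name `percent_above_national_median` and its parameters are hypothetical.

```python
import math


def percent_above_national_median(mean, sd, national_median):
    """Approximate percent of district scores above the national median.

    Assumes district scores are normally distributed with the given mean
    and standard deviation (an assumption, not a method from the report).
    """
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    # Standardized distance of the national median from the district mean.
    z = (national_median - mean) / sd
    # Upper-tail probability of the standard normal: 1 - Phi(z).
    upper_tail = 0.5 * math.erfc(z / math.sqrt(2.0))
    return 100.0 * upper_tail
```

For example, a district NCE mean of 55 with a standard deviation of 20 gives roughly 60 percent above the national median under these assumptions. Reported quartile scores (columns 19 and 20) offer a rough check on the estimate.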



Appendix E 

District Subsample for Telephone Interviews



E-1



Appendix E 



District Subsample for Telephone Interviews 

The 40 cells (5 levels of SES by 8 levels of 
district size) within each of the 4 regions that were used 
to define the overall district sample were collapsed to 15 
cells (3 levels of SES by 5 levels of district size) to 
select the subsample to be interviewed by telephone. The 
following levels were combined for each factor. 





               SES                                        Size

Subsample                             Subsample
Level          Total Sample Level     Level              Total Sample Level

1 Below        Low & Below            1 <2,500           <1,200 &
  Average      Average                                   1,200-2,499

2 Average      Average                2 2,500-9,999      2,500-4,999 &
                                                         5,000-9,999

3 Above        Above Average          3 10,000-49,999    10,000-24,999 &
  Average      & High                                    25,000-49,999

                                      4 50,000-99,999    50,000-99,999

                                      5 100,000 +        100,000 +



For cells of the subsample design that consisted of
2 or 4 of the cells of the total sample, one district was
randomly selected. The SES = 1, size = 1 cell of the
interview subsample, for example, consists of SES by size
cells 11, 12, 21, and 22 in the total sample. A random
number between 1 and 4 corresponding to each of those
original cells was selected for each region. Following this
procedure for each of the interview subsample cells that
contained more than one cell from the total sample, 56
districts (4 regions x 3 SES levels x 5 size levels minus 4
void cells) for the interview subsample were selected.
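
The collapsing and random-selection steps described above can be sketched in code. This is a minimal illustration under stated assumptions, not the study's actual procedure: the level mappings are read from the table in this appendix, while the function names, the use of Python's `random` module, and the seed parameter are hypothetical. The sketch does not know which cells are void, since that depends on district availability (Appendix C), so it returns 60 candidate cells rather than the 56 districts reported.

```python
import random

# Total-sample level -> subsample level, as given in the collapsing table.
SES_COLLAPSE = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}
SIZE_COLLAPSE = {1: 1, 2: 1, 3: 2, 4: 2, 5: 3, 6: 3, 7: 4, 8: 5}


def subsample_cells():
    """Group total-sample (size, SES) cells by collapsed (size, SES) cell."""
    groups = {}
    for size, size_sub in SIZE_COLLAPSE.items():
        for ses, ses_sub in SES_COLLAPSE.items():
            groups.setdefault((size_sub, ses_sub), []).append((size, ses))
    return groups


def draw_interview_cells(regions=(1, 2, 3, 4), seed=None):
    """Pick one total-sample cell at random per region and subsample cell.

    Returns three-digit codes: region, size level, SES level.
    """
    rng = random.Random(seed)
    groups = subsample_cells()
    picks = []
    for region in regions:
        for key in sorted(groups):
            size, ses = rng.choice(groups[key])
            picks.append(f"{region}{size}{ses}")
    return picks
```

Each collapsed cell contains 1, 2, or 4 original cells, which matches the description above: a random draw is only needed where more than one original cell was combined.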






Appendix E (Continued, page 2 of 2) 

Using the total sample code RZS where
R = region (1 = East, 2 = North/Central, 3 = South,
    and 4 = West),
Z = size (1 = less than 1,200, 2 = 1,200-2,499,
    3 = 2,500-4,999, 4 = 5,000-9,999, 5 = 10,000-24,999,
    6 = 25,000-49,999, 7 = 50,000-99,999,
    and 8 = 100,000 or more), and
S = SES (1 = low, 2 = below average, 3 = average,
    4 = above average, and 5 = high),
the following interview subsample was selected.



112            211            312            411
123            213            323            415
124            225            324            423
131            233            332            432
134            242            335            433
145            245            343            445
153            251            353            454
155            255            362            462
161            263            365            463
172            272            371            471
173            273            373            474
174            275 (void)     374            474
181            282            362            481 (void)
183 (void)     283            383            483
184            285 (void)     385            484
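
A three-digit RZS code can be decoded mechanically from the definitions given above. The following is a minimal sketch; the dictionary and function names are hypothetical and are introduced only for illustration.

```python
# Level labels copied from the RZS definition in this appendix.
REGION = {1: "East", 2: "North/Central", 3: "South", 4: "West"}
SIZE = {
    1: "less than 1,200", 2: "1,200-2,499", 3: "2,500-4,999",
    4: "5,000-9,999", 5: "10,000-24,999", 6: "25,000-49,999",
    7: "50,000-99,999", 8: "100,000 or more",
}
SES = {1: "low", 2: "below average", 3: "average", 4: "above average", 5: "high"}


def decode_rzs(code):
    """Split a code such as '112' into region, size, and SES labels."""
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"expected a three-digit RZS code, got {code!r}")
    r, z, s = (int(ch) for ch in code)
    return REGION[r], SIZE[z], SES[s]


print(decode_rzs("112"))
```

Under these definitions, the first listed code, 112, denotes a district in the East with fewer than 1,200 students and below-average SES.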




Appendix F 



Grades Tested by Districts Returning Data 






Appendix F (page 1 of 4) 
Grades Tested by Districts Returning Data 

                     Grade

Region  Size  SES    K   1   2   3   4   5   6   7   8   9   10  11  12


X 


X 


X 


+ 




+ 






+ 












X 


X 


3 


+ 




+ 


+ 
















1 


X 


4 








•4- 


+ 


+ 


+ 










1 


2 


X 


+ 


+ 


+ 


+ 


+ 




+ 




+ 






1 


2 


2 




+ 




+ 








+ 




+ 




1 


2 


4 




+ 












+ 








1 


2 


5 






+ 


+ 


+ 














1 


3 


X 






+ 




+ 


+ 


+ 


+ 










3 


2 


+ 


+ 


+ 


+ 


+ 




+ 












3 


3 






+ 


+ 


+ 














1 


3 


4 


















+ 


+ 




1 


3 


5 


+ 




+ 




+ 


+ 


+ 


+ 








1 


4 


2 


+ + 




+ 


+ 


+ 


+ 


+ 


+ 








X 


4 


3 






+ 




+ 




+ 






+ 




X 


4 


4 


+ 




+ 




+ 




■f 


+ 


+ 


+ 




X 


4 


5 








+ 


+ 








+ 






X 


5 


3 






+ 




+ 






+ 








X 


5 


4 












+ 












X 


5 


5 


Interview completed; no normed test results 


X 


6 


1 


+ 




+ 


+ 






+ 






t- 




X 


6 


2 




+ 


+ 




+ 




+ 


+ 




+ 


+ + 


X 


6 


4 






+ 




+ 






+ 








X 


6 


5 










+ 






+ 






+ 


X 


7 


1 


+ + 


+ 






+ 




+ 


+ 




+ 




X 


7 


2 






+ 




+ 


+ 


+ 


+ 








X 


7 


2 


+ 


+ 


■f 




+ 




+ 




+ 


+ 


+ + 


X 


7 


3 




+ 








+ 




+ 


+ 






X 


7 


4 






+ 




+ 






+ 








X 


8 


X 


+ 


+ 


+ 


+ 


+ 


+ 


+ 




+ 




+ 


X 


d 


2 






+ 










+ 


+ 


+ 




X 


8 


2 








+ 


+ 








+ 


+ 




X 


8 


3 










+ 






+ 








X 


8 


4 






+ 




+ 






+ 








X 


8 


5 






+ 




+ 






+ 






+ 


2 


X 


X 


•f 




+ 


+ 






+ 


+ 




+ 




2 


X 


3 


+ + 








+ 


+ 


+ 


+ 




+ 




2 


X 


4 




+ 


+ 


+ 


+ 


4- 


+ 


+ 


+ 




+ + 


2 


X 


5 


+ + 




+ 


+ 


+ 




+ 










2 


2 


4 


+ 




+ 


+ 


+ 


+ 


+ 










2 


2 


5 






+ 


+ 


+ 


+ 


+ 






+ 




2 


3 


X 










+ 










+ 




2 


3 


2 


Questionnaire completed; no usable test results 


2 


3 


3 






+ 




+ 














2 


3 


4 






+ 


+ 


+ 




+ 










2 


4 


X 


+ + 






■f 


+ 


+ 


+ 


+ 


+ 


+ 


+ + 






Appendix F (page 2 of 4) 

                     Grade

Region  Size  SES    K   1   2   3   4   5   6   7   8   9   10  11  12


2 


4 


2 




+ 


+ 


+ 




+ 


+ 




+ 


+ 


+ 


+ 




2 


4 


3 


















+ 










2 


4 


4 










+ 


+ 








+ 








2 


5 


2 




+ 




+ 




+ 


+ 


+ 


+ 




+ 




4- 


2 


d 






+ 


+ 


+ 












+ 






4- 


2 


c 


4 


Criterion Referenced Test results only 








2 


c 
9 


c 
D 








+ 






+ 




+ 


+ 










o 








+ 




+ 




+ 


+ 


+ 




> 






2 




J 






+ 


+ 


+ 


+ 




+• 


+ 










2 


















+ 




+ 










2 




e 
D 








+ 






+ 














2 


o 




















+ 










2 


7 


X 


+ 


+ 


+ 


+ 






+ 


+ 


+ 


+ 






+ 


2 


•7 




+ 


+ 


+ 


+ 


+ 


+ 


+ 


+ 


+ 


+ 




+ 




2 


i 






+ 


+ 




+ 


+ 


+ 




+ 


+ 








2 


7 






+ 




+ 


+ 


+ 




+ 




■r 








2 


1 




+ 




+ 






+ 










+ 








7 


•J 






+ 


+ 


+ 


+ 


•f 


+ 


+ 


■ 










Q 
Q 




+ 






+ 




+ 












+ 






O 

o 


># 




+ 




+ 


+ 


+ 




+ 


+ 












1 

X 


1 

X 






+ 


+ 


+ 


+ 






+ 


+ 






+ 


3 


1 


2 


+ 


+ 


+ 




+ 




+ 




4* 












X 




+ 


+ 


+ 


+ 


+ 




+ 
















X 








+ 


+ 


+ 


+ 


+ 




+ 






+ 








X 




•f 


+ 




+ 


+ 


+ 






+ 








*> 






+ 


+ 


+ 


+ 


+ 


•¥ 


+ 


+• 


+ 


















+ 


+ 


+ 


+ 


+ 


+ 


+• 




+ 




+ 








*» 




+ 


+ 




+ 


+ 


+ 


■♦- 


+ 














X 


+ 


+ 




+ 


+ 


+ 


+ 


•f 




+ 








3 






+ 




+ 


+ 


+ 




+ 


•r 












-1 
4 








+ 


+ 


+ 


+ 


+ 




+ 


+ 




+ 








-> 


A 
■t 




+ 






+ 




+ 


















e 




+ 


+ 




+ 


+ 










+ 






•1 




X 




+ 


•I- 
































+ 


+ 


+ 




+ 


+ 


+ 




+ 


+ 






4 


3 




+ 


+ 




+ 




+ 




+ 










3 


4 


4 


+ 


+ 


+ 


+ 




+ 


+ 




+ 






+ 




3 


4 


5 


+ 


+ 


+ 


+ 


+ 


+ 




+ 












3 


5 


1 










+ 




+ 














3 


5 


3 










+ 






+ 




+ 




4- 




3 


5 


4 




+ 


+ 


+ 


+ 




+ 




+ 




+ 


+ 




3 


5 


5 




+ 


+ 


+ 


+ 


+ 


+ 










+ 




3 


6 


1 


















+ 










3 


6 


2 








+ 


+ 


















3 


6 


3 








+ 


+ 




+ 










+ 




3 


6 


4 










+ 








+ 










3 


6 


5 










+ 








+ 














Appendix F (page 3 of 4) 

                     Grade

Region  Size  SES    K   1   2   3   4   5   6   7   8   9   10  11  12


3 


7 


1 


+ 










A_ 






X 


A 


X 




3 


7 


2 


+ 


+ 
















A 






3 


7 


2 


+ + 


+ 


■f 




+ 


A 


A 


A 
+ 


A 


A 


A 


"T 


3 


7 


2 


+ 




-f 




+ 


a 
+ 


A 

-r 




A 








3 


7 


3 














■A. 

■r 




A- 






A 


3 


7 


3 


+ 




•f 






1 




1 


■ 








3 


7 


3 








+ 




+ 


+ 












3 


7 


3 


+ 




+ 






A 


A 












3 


7 


3 


+ 














A 

■r 










3 


7 


4 


+ + 






+ 










A 
+ 








3 


7 


4 
















■f 






A 
+ 




3 


7 


5 




+ 




+ 




1 


A 

*r 




A 








3 


8 


2 




+ 




A 
+ 


A 


A 


1. 


■L 


A 


A 






3 


8 


3 


a 1 
+ + 






A 


1 




1 


A 


A 


A 


A 


1 


3 


8 


3 


i 
+ 




A 


1 


A- 






J, 


X 


X 


X 


X 


3 


8 


3 


i i 
+ + 


A 


A 


A 


A 








X 


t 

T 


X 




3 


8 


3 


+ + 


i 
+ 


A 
+ 


A 
+ 


A 




A 


T 


X 


X 


X 




3 


8 


5 




I 




t 




A 










X 




4 


1 


1 


1 


1 


A 


1- 


















4 


1 


2 


1 1 

•r + 


A 


A 


A 


A 






X 


X 


X 


X 




4 


1 


3 


A 
+ 


* 


J 

■r 




A, 

•r 


A- 




X 










4 


2 


2 




A 

•r 




A 






*T 


A 






X 




4 


2 


4 
















X 




X 






4 


2 


4 


+ 


A 

•r 












X 










4 


2 


5 








A 
+ 




A 


*T 


X 




X 






4 


3 


X 


+ ▼ 




+ 


1 




1 




A 


A 


A 


A 




4 


3 


2 


+ 


+ 


+ 














T 






4 


3 


3 


+ 


+ 


+ 




+ 








X 


X 


X 




4 


3 


4 








A 












X 






4 


3 


5 


+ 


+ 


+ 


A. 


+ 


+ 














4 


4 


1 


+ 




+ 




















4 


4 


2 


+ 


+ 


+ 


A 










X 


A 


X 


X 


4 


4 


4 






+ 


A 








+ 










4 


4 


5 


+ + 


+ 


+ 


A 


+ 




+ 












4 


5 


1 


+ 








+ 


+ 


+ 


+ 




1 

■r 




A 


4 


5 


2 


+ + 


+ 


+ 








+ 




• 

+ 








4 


S 


3 


Only Chapter I test data provided 










4 


5 


4 


+ 




■¥ 


+ 


+ 


+ 














4 


5 


5 






+ 


+ 




-I- 


-4- 












4 


6 


1 




+ 


+ 


+ 


+ 




+ 




+ 


+ 






4 


6 


2 




+ 








+ 




+ 










4 


6 


3 


+ 


+ 


+ 




















4 


6 


4 






+ 


+ 








+ 


+ 








4 


6 


5 


4- + 


+ 


+ 


+ 




+ 


+ 




+ 








4 


7 


1 




+ 


+ 








+ 




+ 




+ 




4 


7 


2 




+ 


+ 






+ 


+ 




+ 


+ 






4 


7 


3 


+ + 




+ 






+ 


















Appendix F (page 4 of 4) 

                     Grade

Region  Size  SES    K   1   2   3   4   5   6   7   8   9   10  11  12


4 


7 


3 








+ 




+ 
















4 


7 


3 


■I- 


+ 


+ 


+ 


+ 






+ 


+ 


+ 


+ 


+ 


+ 


4 


7 


3 




+ 


+ 


+ 


+ 


+ 


+ 




+ 




+ 




+ 


4 


7 


3 


Criterion Referenced Test results only 








4 


7 


4 


+ 


+ 








+ 


+ 


+ 


+ 


+ 




+ 


+ 


4 


7 


4 




















a. 








4 


7 


4 














+ 




+ 


+ 




+ 




4 


7 


4 


+ 






+ 


+ 


+ 


+ 








+ 


+ 




4 


7 


4 


+ 


+ 






+ 


+ 


+ 


+ 




+ 




+ 




4 


7 


5 








+ 




+ 


+ 




+ 


+ 




+ 




4 


8 


3 




+ 


+ 


+ 




+ 


+ 






+ 








4 


8 


3 


+ 


+ 






+ 


+ 


+ 


+ 


+ 




+ 






4 


8 


3 


+ 


+ 


+ 


+ 


+ 


+ 


+ 


+ 


■f 


+ 


+ 


+ 


+ 


4 


8 


4 
















+ 












Totals    153    43   40   111   123   123   123   118   104   120   82   74   66   26






Appendix G 

Stem-and-Leaf Distributions of District Reports of 
the Percent of Students Scoring Above the National 
Median in Reading and Mathematics 






Appendix G 
Figure G-1 

Stem-and-Leaf Distribution of the District Percents of Students 
Scoring Above the National Median at Grade 1 



Reading                         Mathematics

Stem  Leaf           Count      Stem  Leaf           Count

9 :   6                  1      9 :   589                3
9 :   01                 2      9 :   3                  1
8 :   9                  1      8 :   9                  1
8 :   013                3      8 :   034                3
7 :   588                3      7 :   55678              5
7 :   34                 2      7 :   0113               4
6 :   55689              5      6 :   6899999            7
6 :   012224             6      6 :   001223344444      12
5 :   5567789            7      5 :   5899               4
5 :   001224444          9      5 :   012                3
4 :   5579               4      4 :   669                3
4 :   0134               4      4 :   34                 2
3 :   56689              5      3 :   88                 2
3 :   0023               4      3 :   02                 2
2 :   6                  1      2 :   89                 2
2 :                      0      2 :   2                  1
1 :                      0      1 :                      0
1 :                      0      1 :                      0

P90 = 81                        P90 = 84
P75 = 66                        P75 = 71
P50 = 55                        P50 = 64
P25 = 45                        P25 = 51
P10 = 35                        P10 = 38
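
The displays in this appendix split each tens digit into two rows, an upper row for leaves 5-9 and a lower row for leaves 0-4, with a count per row. A minimal sketch of how such a display can be built from a list of district percents follows. The function name `split_stem_and_leaf` and the sample values are hypothetical, and the sketch assumes whole-number percents between 10 and 99; it does not attempt to reproduce the percentile method used in the report, which is not stated.

```python
from collections import defaultdict


def split_stem_and_leaf(values):
    """Return display rows for stems 9 down to 1, two rows per stem."""
    rows = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(int(value), 10)
        rows[(stem, leaf >= 5)].append(leaf)  # True = upper half (leaves 5-9)
    lines = []
    for stem in range(9, 0, -1):
        for upper in (True, False):
            leaves = rows.get((stem, upper), [])
            leaf_text = "".join(str(d) for d in leaves)
            lines.append(f"{stem} :   {leaf_text:<17}{len(leaves):>3}")
    return lines


# Hypothetical input: four district percents.
for line in split_stem_and_leaf([96, 90, 91, 89]):
    print(line)
```

Rows with no districts are still printed with a count of 0, which matches the layout of the figures and keeps the vertical scale uniform.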






Appendix G 
Figure G-2 

Stem-and-Leaf Distribution of the District Percents of Students 
Scoring Above the National Median at Grade 2 



Reading                         Mathematics

Stem  Leaf           Count      Stem  Leaf           Count

9 :                             9 :   559                3
9 :   12                 2      9 :   01?                3
8 :   577                3      8 :   67                 2
8 :   0012               4      8 :   001334             6
7 :   5799               4      7 :   5779               4
7 :   12                 2      7 :   0001212222344     13
6 :   555688899          9      6 :   55566788889       11
6 :   0012344            7      6 :   000011222          9
5 :   56677788999       11      5 :   56677889           8
5 :   0122334444        10      5 :   001124             6
4 :   557778899          9      4 :   568                3
4 :   111123344          9      4 :   23                 2
3 :   999                3      3 :   6                  1
3 :   1                  1      3 :   4                  1
2 :   99                 2      2 :                      0
2 :   2                  1      2 :                      0
1 :                      0      1 :   68                 2
1 :                      0      1 :                      0

P90 = 80                        P90 = 86
P75 = 68                        P75 = 74
P50 = 57                        P50 = 67
P25 = 47                        P25 = 57
P10 = 41                        P10 = 46






Appendix G 



Figure G-3 

Stem-and-Leaf Distribution of the District Percents of Students 
Scoring Above the National Median at Grade 3 



Reading

Stem  Leaf           Count

9 :                      0
9 :   34                 2
8 :   558                3
8 :   12                 2
7 :   56799              5
7 :   0122344            7
6 :   677777789          9
6 :   00111224444444    14
5 :   5566677899        10
5 :   001233344          9
4 :   556889             6
4 :   001223             6
3 :   69                 2
3 :   012223344          9
2 :   89                 2
2 :   14                 2
1 :   5                  1
1 :                      0

P90 = 78
P75 = 67
P50 = 58
P25 = 45
P10 = 32

Mathematics

Stem  Leaf           Count

9 :                      0
9 :   123                3
8 :   7899               4
8 :   012224             6
7 :   88                 2
7 :   000112244          9
6 :   556778888899      12
6 :   000123344444      12
5 :   55567788999       11
5 :   1222333444        10
4 :   556667899          9
4 :   0224               4
3 :   69                 2
3 :   334                3
2 :                      0
2 :   0                  1
1 :                      0
1 :   1                  1

P90 = 82
P75 = 70
P50 = 61
P25 = 52
P10 = 42






Appendix G 
Figure G-4 

Stem-and-Leaf Distribution of the District Percents of Students 
Scoring Above the National Median at Grade 4 



Reading

Stem  Leaf           Count

9 :   5                  1
9 :   00                 2
8 :   79                 2
8 :   001                3
7 :   67799              5
7 :   00133444           8
6 :   6888               4
6 :   000022234          9
5 :   5567788899        10
5 :   01112222244       11
4 :   66777899           8
4 :   013444             6
3 :   5568889            7
3 :   12234444           8
2 :   7                  1
2 :   1                  1
1 :                      0
1 :   1                  1

P90 = 79
P75 = 68
P50 = 55
P25 = 44
P10 = 34

Mathematics

Stem  Leaf           Count

9 :   9                  1
9 :   034                3
8 :   69                 2
8 :   0033               4
7 :   589                3
7 :   024                3
6 :   5557777888889     13
6 :   0000012223344     13
5 :   55556667778       11
5 :   0011222222333344  16
4 :   55789              5
4 :   011224             6
3 :   5579               4
3 :                      0
2 :                      0
2 :                      0
1 :                      0
1 :   2                  1

P90 = 81
P75 = 68
P50 = 59
P25 = 52
P10 = 42







Appendix G 
Figure G-5 

Stem-and-Leaf Distribution of the District Percents of Students 
Scoring Above the National Median at Grade 5 



Reading                         Mathematics

Stem  Leaf           Count      Stem  Leaf           Count

9 :                      0      9 :   6                  1
9 :   03                 2      9 :   0013               4
8 :   5                  1      8 :   6                  1
8 :   00112333           8      8 :   002234             6
7 :   55578              5      7 :   55777899           8
7 :   0011223344        10      7 :   02244              5
6 :   5699               4      6 :   66677778888899    14
6 :   00112224           8      6 :   111122344444      12
5 :   666667788          9      5 :   556677899          9
5 :   0001122233        10      5 :   002222244          9
4 :   567888999          9      4 :   5667888899        10
4 :   11244              5      4 :   1344               4
3 :   55567799           8      3 :   57                 2
3 :   02334              5      3 :   2                  1
2 :   679                3      2 :                      0
2 :                      0      2 :   2                  1
1 :   9                  1      1 :                      0
1 :                      0      1 :                      0

P90 = 80                        P90 = 82
P75 = 72                        P75 = 73
P50 = 56                        P50 = 64
P25 = 45                        P25 = 52
P10 = 34                        P10 = 45






Appendix G 
Figure G-6 

Stem-and-Leaf Distribution of the District Percents of Students 
Scoring Above the National Median at Grade 6 



Reading                         Mathematics

Stem  Leaf           Count      Stem  Leaf           Count

9 :                      0      9 :   79                 2
9 :   2                  1      9 :   4                  1
8 :   69                 2      8 :   556                3
8 :   0234               4      8 :   1444               4
7 :   55556              5      7 :
7 :   0001234            7      7 :   123                3
6 :   5555589            7      6 :   5566888999        10
6 :   0144               4      6 :   022222222334444   15
5 :   66677777889       11      5 :   55556667788999    14
5 :   001223334          9      5 :   0011123            7
4 :   555678999          9      4 :   5556677889        10
4 :   0122234            7      4 :   22244              5
3 :   56666677889       11      3 :   89                 2
3 :   00024              5      3 :                      0
2 :   69                 2      2 :                      0
2 :                      0      2 :   3                  1
1 :                      0      1 :                      0
1 :   2                  1      1 :   2                  1

P90 = 75                        P90 = 84
P75 = 65                        P75 = 69
P50 = 54                        P50 = 62
P25 = 42                        P25 = 50
P10 = 35                        P10 = 44






Appendix G 



Figure G-7 

Stem-and-Leaf Distribution of the District Percents of Students 
Scoring Above the National Median at Grade 7 



Reading

Stem  Leaf           Count

9 :                      0
9 :   0                  1
8 :   7                  1
8 :   13                 2
7 :   556?9              5
7 :   00?4               4
6 :   57789              5
6 :   001112333          9
5 :   566778             6
5 :   0011223344        10
4 :   577799             6
4 :   0334               4
3 :   7778999            7
3 :   0024               4
2 :   68899              5
2 :                      0
1 :                      0
1 :   0                  1

P90 = 75
P75 = 64
P50 = 54
P25 = 40
P10 = 30

Mathematics

Stem  Leaf           Count

9 :                      0
9 :   0333               4
8 :   6                  1
8 :   00034              5
7 :   8                  1
7 :   003                3
6 :   6677777789        10
6 :   0011123334        10
5 :   5667777899        10
5 :   23344              5
4 :   56778889           8
4 :   00022234           8
3 :   66788              5
3 :                      0
2 :   8                  1
2 :                      0
1 :   9                  1
1 :                      0

P90 = 80
P75 = 67
P50 = 59
P25 = 47
P10 = 39






Appendix G 



Figure G-8 

Stem-and-Leaf Distribution of the District Percents of Students 
Scoring Above the National Median at Grade 8 



Reading

Stem  Leaf           Count

9 :                      0
9 :                      0
8 :   56                 2
8 :   233                3
7 :   67889              5
7 :   001233             6
6 :   555667889          9
6 :   0011234            7
5 :   55567777899       11
5 :   011123334          9
4 :   5667778            7
4 :   0011244            7
3 :   667789             6
3 :   11233344           8
2 :   899                3
2 :                      0
1 :   9                  1
1 :                      0

P90 = 77
P75 = 66
P50 = 55
P25 = 41
P10 = 33

Mathematics

Stem  Leaf           Count

9 :                      0
9 :   1                  1
8 :   57                 2
8 :   00223?             6
7 :   5666?88            7
7 :   023334             6
6 :   56?79              5
6 :   1111222233344     13
5 :   677788999          9
5 :   12444444           8
4 :   5589999            7
4 :   0133444            7
3 :   55666789           8
3 :   0044               4
2 :                      0
2 :                      0
1 :                      0
1 :   01                 2

P90 = 79
P75 = 70
P50 = 59
P25 = 45
P10 = 36






Appendix G 
Figure G-9 

Stem-and-Leaf Distribution of the District Percents of Students 
Scoring Above the National Median at Grade 9 



Reading

Stem  Leaf           Count

9 :                      0
9 :   2                  1
8 :                      0
8 :   3                  1
7 :   779                3
7 :   2                  1
6 :   6889               4
6 :   1113               4
5 :   566777789          9
5 :   00111113           8
4 :   566899             6
4 :   001112344          9
3 :   55668              5
3 :   22344              3
2 :   8                  1
2 :   014                3
1 :   6                  1
1 :                      0

P90 = 69
P75 = 58
P50 = 50
P25 = 39
P10 = 32

Mathematics

Stem  Leaf           Count

9 :                      0
9 :                      0
8 :   6699               4
8 :                      0
7 :   559                3
7 :   1233               4
6 :   5777               4
6 :   00012234           8
5 :   589                3
5 :   00011344           8
4 :   5568999            7
4 :   12344              5
3 :   669                3
3 :   0034               4
2 :   79                 2
2 :   01                 2
1 :                      0
1 :                      0

P90 = 75
P75 = 65
P50 = 53
P25 = 44
P10 = 30






Appendix G 



Figure G-10 

Stem-and-Leaf Distribution of the District Percents of Students 
Scoring Above the National Median at Grade 10 



Reading

Stem  Leaf           Count

9 :                      0
9 :                      0
8 :                      0
8 :   4                  1
7 :   5                  1
7 :   00334              5
6 :   568                3
6 :   00123              5
5 :   667                3
5 :   02344              5
4 :   55677889           8
4 :   0133444            7
3 :   7789               4
3 :   01344              5
2 :   578                3
2 :   0                  1
1 :   5                  1
1 :                      0

P90 = 71
P75 = 61
P50 = 48
P25 = 38
P10 = 23

Mathematics

Stem  Leaf           Count

9 :                      0
9 :   0                  1
8 :   55                 2
8 :   011                3
7 :   56                 2
7 :   02                 2
6 :   559                3
6 :   0114               4
5 :   556777789          9
5 :   134                3
4 :   689                3
4 :   1233334            7
3 :   5678888            7
3 :   04                 2
2 :                      0
2 :                      0
1 :                      0
1 :   0                  1

P90 = 80
P75 = 65
P50 = 55
P25 = 43
P10 = 36






Appendix G 
Figure G-11 

Stem-and-Leaf Distribution of the District Percents of Students 
Scoring Above the National Median at Grade 11 



Reading

Stem  Leaf           Count

9 :                      0
9 :                      0
8 :   6                  1
8 :   0                  1
7 :   579                3
7 :   0144               4
6 :   5                  1
6 :   011223             6
5 :   678                3
5 :   001123344          9
4 :   567                3
4 :   113                3
3 :   55889              5
3 :   123                3
2 :                      4
2 :   1                  1
1 :   9                  1
1 :   0                  1

P90 = 75
P75 = 62
P50 = 52
P25 = 38
P10 = 27

Mathematics

Stem  Leaf           Count

9 :   6                  1
9 :                      0
8 :                      0
8 :   023                3
7 :   599                3
7 :   22                 2
6 :   67899              5
6 :   01233334           8
5 :   66889              5
5 :   00                 2
4 :   578                3
4 :   244                3
3 :   5558999            7
3 :   114                3
2 :                      0
2 :                      0
1 :                      0
1 :   0                  1

P90 = 79
P75 = 68
P50 = 59
P25 = 42
P10 = 35






Appendix G 
Figure G-12 

Stem-and-Leaf Distribution of the District Percents of Students 
Scoring Above the National Median at Grade 12 



Reading                         Mathematics

Stem  Leaf           Count      Stem  Leaf           Count

9 :                      0      9 :   5                  1
9 :                      0      9 :                      0
8 :                      0      8 :                      0
8 :                      0      8 :                      0
7 :   79                 2      7 :                      0
7 :   24                 2      7 :   02                 2
6 :                      0      6 :   789                3
6 :   2                  1      6 :   0                  1
5 :   888                3      5 :   77                 2
5 :   011                3      5 :   4                  1
4 :   88                 2      4 :   5589               4
4 :   011                3      4 :   14                 2
3 :   6                  1      3 :   6                  1
3 :   3                  1      3 :   4                  1
2 :   7                  1      2 :                      0
2 :   1                  1      2 :                      0
1 :                      0      1 :                      0
1 :   3                  1      1 :   0                  1

P90 = 75                        P90 = 71
P75 = 58                        P75 = 67
P50 = 50                        P50 = 55
P25 = 40                        P25 = 45
P10 = 21                        P10 = 35






